Abstract

Message passing programs commonly use buffers to avoid unnecessary synchroniza- 
tions and to improve performance by overlapping communication with computation. 
Unfortunately, using buffers makes the program no longer portable, potentially un- 
able to complete on systems without a sufficient number of buffers. Effective buffer 
use entails that the minimum number needed for a safe execution be allocated. 

We explore a variety of problems related to buffer allocation for safe and efficient 
execution of message passing programs. We show that determining the minimum 
number of buffers or verifying a buffer assignment are intractable problems. How- 
ever, we give a polynomial time algorithm to determine the minimum number of 
buffers needed to allow for asynchronous execution. We extend these results to sev- 
eral different buffering schemes, which in some cases make the problems tractable. 

Key words: Message passing systems, Buffer allocation, Complexity, Parallel and 
distributed programming 



1 Introduction 



In the last decade MPI [8] and PVM [14] have become the de facto standards 
for message passing programs. They have replaced the myriad of libraries that 
provided a degree of portability for message passing programs. One aspect
of portability introduced in the MPI standard was that of a safe program. 
As defined in the standard, a program is safe if it requires no buffering, that 
is, if it is synchronous. Safe programs can be ported to machines with differ- 
ing amounts of buffer space. However, to demand that the program execute 
correctly with no buffering is restrictive. Buffering reduces the amount of syn- 
chronization delay and also makes it possible to off-load communication to the 
underlying system or network components, thus overlapping communication 
and computation. Although one cannot assume an infinite number of buffers, 
by characterizing the buffer requirements of a given program it becomes possi- 
ble to determine, with respect to buffer availability, whether the program can 
be ported to a given machine. The notion of k-safety is introduced to address
the problem of identifying the buffer requirements of a program to avoid buffer 
overflows and deadlock. Determining the minimum k, under a variety of buffer 
placements, is important for constructing programs that are both safe and can 
effectively exploit the underlying hardware. 

Unfortunately, the value of k is usually not known a priori. We investigate 
the complexity of determining a minimum value of k for programs using asyn- 
chronous buffered communication with a static communication pattern and a 
bounded message size. We consider the following three problems: the Buffer 
Allocation Problem (BAP), which is the problem of determining the minimum 
number of buffers required to ensure deadlock free execution (i.e., determine 
k for k-safety); the Buffer Sufficiency Problem (BSP), which is to determine
whether a given buffer assignment is sufficient to avoid deadlock; and finally, 
the Nonblocking Buffer Allocation Problem (NBAP), which is to determine the 
minimum number of buffers needed to allow for asynchronous execution, that 
is, when send calls do not block. 

The complexity of these questions also depends on the type of buffers provided 
by the system. We consider four types of system buffering schemes. In the first 
three schemes the buffers are (1) pre-allocated on the send side only, (2) the 
receive side only, or (3) mixed and pre-allocated on both sides. Finally, we 
also consider a scheme that pre-allocates buffers on a per channel basis, where 
each communication channel can buffer a fixed number of messages. 

We show that the Buffer Allocation Problem is intractable under all four 
buffer allocation schemes. The Buffer Sufficiency Problem is intractable for the 
receive side buffer and for the mixed buffer allocation schemes, tractable for 
the channel scheme and conjectured tractable for sender side buffers. Finally, 
the Nonblocking Buffer Allocation Problem is tractable for all buffer placement 
schemes, except the mixed send and receive scheme. 
2 Related Work



The multiprocess system that we consider is a collection of simultaneously 
executing independent asynchronous processes that compute by interspers- 
ing local computation and point-to-point message passing between processes; 
these are referred to as A-computations in [4]. Such a system is equivalent to
one with three different events, such as the one defined by Lamport [18] : send 
events, receive events and internal events. As well, we only consider programs 
that are repeatable [6,7] when executed in an unrestricted environment, that 
is, programs with static communication patterns. While this narrows the class 
of programs we consider, the class of applications with static communication 
patterns is still considerable. 

The message passing primitives considered in this paper are the traditional 
asynchronous, buffered communications: the nonblocking send and the blocking 
receive, which are the standard primitives used in MPI and PVM. Cypher 
and Leu formally define the former as a POST-SEND immediately followed 
by a WAIT-FOR-BUFFER-RELEASE and the latter as a POST-RECEIVE 
immediately followed by a WAIT-FOR-RECEIVE-TO-BE-MATCHED [6,7].
Informally, the send blocks until the message is copied out of the process into 
a send buffer; the receive blocks until the message has been copied into the 
receive buffer. 

The notion of safety, as introduced in the MPI standard, underscores the
concern that, when buffer resources are unknown, asynchronous communication
can potentially deadlock the system. This notion was extended to k-safety, in
order to better characterize the buffer requirements of the program, thus
making it safe to take advantage of asynchronous communication. The definition
of k-buffer correctness was introduced by Bruck et al. [2] to describe
programs that complete without deadlock in a message passing environment with
k buffers per process. Similarly, Burns and Daoud [3] introduced guaranteed
envelope resources into LAM [12], a public domain version of MPI. Guaranteed
envelope resources, a weaker condition than k-safety, were used in LAM
to reserve a guaranteed number of message header slots on the receiver side.

Determining whether a system is buffer independent — the system is 0-safe — 
was investigated in [6,7]. In our model, the interesting systems are buffer- 
dependent, and require an unknown number of buffers to avoid deadlock. 

More recently in modern clusters, greater overlap of computation and com- 
munication is possible by off-loading communication onto the network inter- 
face cards. Unfortunately, most NICs have orders of magnitude less memory 
than the average host, which makes message buffers a limited resource. Thus, 
programs that use asynchronous message passing, and that execute correctly 
otherwise, might deadlock when executing on a system where parts of the
message passing system have been off-loaded to the NIC. These issues have 
been investigated in several papers [8,9,11,17]. 

To determine the minimum number of buffers, the execution of a system can 
be modeled using a (coloured) Petri net [16]. In order to determine whether 
the system can reach a state of deadlock, the Petri net occurrence graph [15] 
is constructed, and a search for dead markings is performed. However, the size 
of the occurrence graph is exponential in the size of the original Petri net. 

Variations of these problems have been investigated by the operations research 
community [1,20,21]. In these models, events or products are buffered between 
various stations in the production process, however, the arrival of these events 
is governed by probability distributions, which are specified a priori. In our 
model, since processes are asynchronous, the time for a message to arrive is 
non-deterministic; that is, a message may take an arbitrarily long time to 
arrive and a process may take an arbitrarily long time to perform a send or a 
receive. 



3 Definitions 

Let S be a multiprocess system with n processes and $E_i$ communication events
occurring in process i; a communication event is either a send or a receive. A 
multiprocess system S is unsafe if a deadlock can occur due to an insufficient 
number of available buffers; if S is not unsafe, then S is said to be safe. 
Figure 1 is an example of an unsafe system. The numbers above the graph in 
Figure 1 represent the buffer assignment. 

[Figure 1 omitted: a communication graph on processes P1, P2, and P3.]

Fig. 1. Order of execution can cause deadlock.

A per-process buffer assignment is an n-tuple $B = (b_1, b_2, \ldots, b_n)$ of
non-negative integers representing the number of buffers that can be allocated
by each process. Similarly, a per-channel buffer assignment is a q-tuple
$B = (b_1, b_2, \ldots, b_q)$, $q = \binom{n}{2}$, representing the number of
buffers each channel in the system can allocate. Since buffers take up memory,
which may be needed by the application, ideally, as few buffers as possible
should be allocated. However, allocating too few buffers results in an unsafe
system.

Buffer utilization is the nondeterministic phenomenon of interest in the system.
Making the choice of when to use a buffer affects future choices. For exam- 
ple, in Figure 1, using a buffer for communication f before communication 3 
completes results in deadlock. 

Two natural decision problems arise from this optimization problem. Given 
a system S and a non-negative integer k, the Buffer Allocation Problem 
(BAP) is to decide if there exists a buffer assignment B such that S is safe 
and $\sum_{i} b_i \le k$. In order to solve this problem we need to solve a simpler one.
Suppose we are given a buffer assignment B and a system S; the Buffer 
Sufficiency Problem (BSP) is then to decide whether the assignment is 
sufficient to make S safe. 
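The relationship between the two decision problems can be illustrated with a brute-force sketch. It assumes a per-process assignment and a caller-supplied sufficiency test; the names `bap` and `is_sufficient` are hypothetical and do not come from the paper. The enumeration visits up to $(k+1)^n$ candidate assignments, so it is only meant to show how BAP reduces to repeated BSP queries, not to be a practical algorithm.

```python
from itertools import product

def bap(n, k, is_sufficient):
    """Return some assignment B with sum(B) <= k that is_sufficient accepts.

    n             -- number of processes (length of the assignment tuple)
    k             -- bound on the total number of buffers
    is_sufficient -- hypothetical BSP oracle: maps a tuple B to True/False
    Returns None when no such assignment exists.
    """
    for B in product(range(k + 1), repeat=n):
        if sum(B) <= k and is_sufficient(B):
            return B
    return None

# Toy oracle (an assumption for illustration): a system that is safe as soon
# as at least one of its two processes owns a buffer.
def toy_oracle(B):
    return B[0] + B[1] >= 1
```

With the toy oracle, `bap(2, 0, toy_oracle)` finds nothing, while `bap(2, 1, toy_oracle)` returns the first accepted tuple in enumeration order.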

Additionally, we can require that no process in system S should ever block 
on a send. Given a system S and a non-negative integer k, the Nonblocking 
Buffer Allocation Problem (NBAP) is to decide whether there exists a 
buffer assignment B, such that no send in S ever blocks, and $\sum_{i} b_i \le k$.

We model systems by using communication graphs, and executions of systems 
by colouring games on these graphs. Communication graphs can be derived 
from execution traces of a program. The following subsection defines the graph 
based framework used throughout this paper. 

3.1 The Graph Based Framework

A communication graph of S is a directed acyclic graph $G = G(S) = (V, A)$,
where the set of vertices
$V = \{v_{i,c} \mid 1 \le i \le n,\ 0 \le c \le E_i + 1\}$ corresponds to
the communication events and the arc set A consists of two disjoint arc sets:
the computation arc set P and the communication arc set C. Each vertex
represents an event in the system: vertex $v_{i,0}$ represents the start of
process i, vertex $v_{i,c}$, $1 \le c \le E_i$, represents either a send or a
receive event, and vertex $v_{i,E_i+1}$ represents the end of a process. An arc
$(v_{i,c}, v_{i,c+1}) \in P$, $0 \le c \le E_i$, represents a computation
within process i, and an arc $(v_{i,s}, v_{j,t}) \in C$ represents a
communication between different processes, i and j, where $v_{i,s}$ is a send
vertex and $v_{j,t}$ is a receive vertex (e.g., Figure 2). Note that the
process arcs are drawn without orientation for clarity; they are always
oriented downwards. Communication graphs are comparable to the time-space
diagrams (without internal events) noted in [18].
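As an illustration of the definition above, the following sketch builds the vertex set V and the arc sets P and C from per-process event lists. The input encoding (pairs such as `("send", m)` with a message label `m`) and the function name `communication_graph` are assumptions made for this example; processes are numbered from 0 rather than 1.

```python
def communication_graph(events):
    """Build (V, P, C) from per-process event lists.

    events[i] is the ordered list of process i's communication events, each
    ("send", m) or ("recv", m); m is a hypothetical message label that pairs
    exactly one send with exactly one receive.
    Vertex (i, c) plays the role of v_{i,c}: c = 0 is the start vertex and
    c = len(events[i]) + 1 is the end vertex.
    """
    V, P = [], []
    send_at, recv_at = {}, {}
    for i, evs in enumerate(events):
        last = len(evs) + 1                      # index of the end vertex
        V.extend((i, c) for c in range(last + 1))
        # computation arcs join consecutive vertices of the same process
        P.extend(((i, c), (i, c + 1)) for c in range(last))
        for c, (kind, m) in enumerate(evs, start=1):
            (send_at if kind == "send" else recv_at)[m] = (i, c)
    # communication arcs run from each send vertex to its matching receive
    C = [(send_at[m], recv_at[m]) for m in sorted(send_at)]
    return V, P, C

# Two processes that each send first and then receive (a 2-ring).
two_ring = [[("send", "a"), ("recv", "b")],
            [("send", "b"), ("recv", "a")]]
V, P, C = communication_graph(two_ring)
```

For the two-process example, each process contributes four vertices (start, send, receive, end) and three computation arcs, and the two messages yield two communication arcs.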

Fig. 2. An example of a communication graph with a 2-ring.

The ith process component $G_i$ of G is the subgraph $G_i = (V_i, A_i)$, where
$V_i = \{v_{i,c} \in V \mid 0 \le c \le E_i + 1\}$ and
$A_i = \{(v_{i,c}, v_{i,c+1}) \in A \mid 0 \le c \le E_i\}$. The process
component corresponds to a process in S. We construct communication graphs by
connecting process components with arcs. Hence, it is more intuitive to treat
a process component as a chain of send and receive vertices bound by a start
and an end vertex. A channel is represented by a channel pair $(G_i, G_j)$ of
process components.

A t-ring is a subgraph of a communication graph G(S), consisting of $t > 1$
process components, such that in each of the t process components there is a
send vertex $s_{i_j,c_j}$ and a receive vertex $r_{i_j,d_j}$, $c_j < d_j$,
$1 \le j \le t$, such that the arcs $(s_{i_j,c_j}, r_{i_{j'},d_{j'}})$, with
$j' = (j \bmod t) + 1$ and $1 \le j \le t$, are in A. This definition is
equivalent to the definition of a crown in [4].

A t-ring represents a circular dependence of alternating send and receive 
events; see the example in Figure 3. The shaded arcs in Figure 3 show how 
each receive event depends on the preceding send event and each send event 
depends on the corresponding receive event. Thus, without an available buffer, 
there is a circular dependency that results in the system deadlocking. 
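The circular dependence can be made concrete with a small generator. This is a sketch under assumed conventions: processes and messages are numbered from 0, process j sends message j to process j+1 (modulo t) and then receives message j-1, and the names `make_ring` and `dependency_cycle` are hypothetical.

```python
def make_ring(t):
    """Event lists for t processes whose communication graph is one t-ring."""
    if t < 2:
        raise ValueError("a t-ring needs t > 1 process components")
    # Process j: send message j, then receive message (j - 1) mod t.
    return [[("send", j), ("recv", (j - 1) % t)] for j in range(t)]

def dependency_cycle(t):
    """List (waiter, waited_on) pairs when no buffer is available.

    Without a buffer, a send completes only once its receive has started,
    and a receive starts only once the send preceding it in the same
    process has completed.
    """
    cycle = []
    for j in range(t):
        nxt = (j + 1) % t
        cycle.append((f"send {j} at P{j}", f"recv {j} at P{nxt}"))
        cycle.append((f"recv {j} at P{nxt}", f"send {nxt} at P{nxt}"))
    return cycle
```

Following the pairs returned by `dependency_cycle(t)` from the first waiter leads back to it after 2t steps, which is the dependency cycle described above.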




Fig. 3. Dependency cycle in G(S). 

To model the execution of a system S, we define a colouring game that simu- 
lates the execution of the system with respect to G(S). 



3.2 Colouring the Communication Graph 



Given a communication graph G(S), an execution of a corresponding system S 
is represented by a colouring game where the goal is to colour all vertices green; 
a green vertex corresponds to the completion of an event. We use three colours 
to denote the state of each event in the system: a red vertex indicates that
the corresponding event has not yet started, a yellow vertex indicates that 
the corresponding event has started but not completed, and a green vertex 
indicates that the corresponding event has completed. Hence, a red vertex 
must first be coloured yellow before it can be coloured green; this corresponds 
to a traffic light changing from red, to yellow, to green. 1 

We use tokens to represent buffer allocations. A buffer assignment of a process
(or channel) is represented by a pool of tokens associated with the
corresponding process component (respectively, the channel component). An
instance of buffer utilization is represented by removing a token from a token
pool and placing it on the corresponding communication arc.

The colouring game represents an execution via the following rules. Initially, 
the start vertices of G are coloured green and all remaining vertices are 
coloured red; this is called the initial colouring. 



send→yel   A red send vertex may be coloured yellow if the preceding
           vertex is green (the send is ready).

recv→yel   A red receive vertex may be coloured yellow if the corresponding
           send vertex is yellow, and the preceding vertex (in the same
           process component) is green (both the send and the receive
           are ready).

recv⇝yel   A red receive vertex may be coloured yellow if the corresponding
           send vertex is yellow, and a token from the corresponding token
           pool is placed on the incident communication arc (the send is
           ready and a buffer is used).

send→grn   A yellow send vertex may be coloured green if the corresponding
           receive vertex is coloured yellow (the communication has
           completed from the sender's perspective).

recv→grn   A yellow receive vertex may be coloured green if both of its
           preceding vertices are green. If the incident communication
           arc has a token, the token is returned to its token pool (a
           receive completes after the send completes).

end→yel    A red end vertex may be coloured yellow if the preceding
           vertex is green.

end→grn    A yellow end vertex may be coloured green.



Buffer utilization is represented by placing a token from the token pool on
the selected arc, and colouring the corresponding receive vertex yellow. If no
tokens are available, the rule cannot be invoked.

Footnote 1: Naturally, we refer to a European traffic light.

A colouring of G, denoted by $\chi$, is a colour assignment to all vertices,
which can be obtained by repeatedly applying the colouring rules, starting
from the initial colouring. A colouring sequence
$\Sigma = (\chi_1, \chi_2, \ldots)$ is a sequence of colourings such that each
colouring is derived from the preceding one by a single application of one of
the colouring rules. An execution of a multiprocess system S with buffer
assignment B is represented by a colouring sequence on G(S). Each transition,
from one colouring to the next, within a colouring sequence, corresponds to a
change of state of an event in the corresponding execution. Assuming that all
events in the system are ordered, there is a correspondence between the
colouring sequences on G(S) and the executions of system S. Using this
correspondence, we reason about system S by reasoning about colouring
sequences on G(S).

We say that a colouring sequence completes if and only if the last colouring in 
the sequence comprises only green vertices. A colouring sequence deadlocks 
if and only if the last colouring in the sequence has one or more non-green ver- 
tices and the sequence cannot be extended via the application of the colouring 
rules. A system S is safe if and only if every colouring sequence on the graph 
G(S) completes. 
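The colouring game and the definition of safety can be checked mechanically on small graphs. The sketch below explores every colouring sequence exhaustively; it assumes receive-side token pools (one pool per receiving process) and the hypothetical event encoding `("send", m)` / `("recv", m)`. The name `is_safe` is illustrative, start vertices are left implicit (they are always green), and the search is exponential in the worst case.

```python
RED, YEL, GRN = 0, 1, 2

def is_safe(events, tokens):
    """True iff no colouring sequence deadlocks (exhaustive search).

    events[i] -- process i's events, each ("send", m) or ("recv", m)
    tokens[i] -- size of process i's token pool (receive side assumed)
    """
    send_at, recv_at = {}, {}
    for i, evs in enumerate(events):
        for c, (kind, m) in enumerate(evs):
            (send_at if kind == "send" else recv_at)[m] = (i, c)

    def successors(colours, buffered):
        def recolour(i, c, new):
            row = colours[i][:c] + (new,) + colours[i][c + 1:]
            return colours[:i] + (row,) + colours[i + 1:]

        for i, evs in enumerate(events):
            for c in range(len(evs) + 1):        # index len(evs): end vertex
                cur = colours[i][c]
                if cur == GRN:
                    continue
                pred_green = c == 0 or colours[i][c - 1] == GRN
                if c == len(evs):                # rules end->yel, end->grn
                    if cur == YEL or pred_green:
                        yield recolour(i, c, cur + 1), buffered
                    continue
                kind, m = evs[c]
                if kind == "send":
                    j, d = recv_at[m]
                    if cur == RED and pred_green:              # send->yel
                        yield recolour(i, c, YEL), buffered
                    elif cur == YEL and colours[j][d] == YEL:  # send->grn
                        yield recolour(i, c, GRN), buffered
                else:
                    j, d = send_at[m]
                    if cur == RED and colours[j][d] == YEL:
                        if pred_green:                         # recv->yel
                            yield recolour(i, c, YEL), buffered
                        in_use = sum(1 for b in buffered if recv_at[b][0] == i)
                        if in_use < tokens[i]:                 # recv~>yel
                            yield recolour(i, c, YEL), buffered | {m}
                    elif cur == YEL and pred_green and colours[j][d] == GRN:
                        # recv->grn; any token on the arc returns to its pool
                        yield recolour(i, c, GRN), buffered - {m}

    start = (tuple((RED,) * (len(evs) + 1) for evs in events), frozenset())
    seen, stack = {start}, [start]
    while stack:
        colours, buffered = stack.pop()
        nxt = list(successors(colours, buffered))
        if not nxt and any(col != GRN for row in colours for col in row):
            return False                         # a deadlocked colouring
        for state in nxt:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return True
```

For example, two processes that each send before receiving deadlock with no tokens but complete once either process owns one token; a single send matched by a single receive needs no tokens at all.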

We say that a colouring sequence blocks if there exists a sequence on G(S),
ending with a colouring containing a yellow send vertex, that cannot be
extended by applying the buffered rule recv⇝yel to the corresponding receive
vertex. A colouring sequence is block free if every prefix of the sequence
does not block; a communication graph G is block free if all colouring
sequences on it are also block free. If G(S) is block free, then no send in S
will ever block during an execution.

A token assignment, also denoted by B, is a list of nonnegative integers, 
denoting the number of tokens assigned to each token pool; the token assign- 
ment is the abstract representation of a buffer assignment. The number of 
tokens required depends on the number of times that rule recv⇝yel can be
invoked. If a token pool is empty, this means all buffers are in use. 



4 Useful Lemmas 

The following lemmas are used throughout our proofs. Lemma 4.1 character- 
izes the conditions under which a colouring sequence will deadlock. Lemma 4.2 
characterizes conditions under which a single colouring sequence may repre- 
sent all possible colouring sequences. Finally, Lemma 4.3 characterizes a class 
of communication graphs on which no colouring sequence will deadlock.

Lemma 4.1 (The t-Ring Lemma) Let G be a communication graph com- 
prising a single t-ring. Any colouring sequence on G completes if and only if 
the buffered rule recv⇝yel is invoked at least once.

Proof: Assume by contradiction that there exists a complete colouring
sequence $\Sigma$ that does not make use of the buffered rule recv⇝yel.
Consider the first colouring in $\Sigma$ where one of the send vertices is
green; call the vertex $s_j$. Let $r_j$ be the corresponding receive vertex.
According to rule send→grn, the vertex $r_j$ must be yellow. Since rule
recv⇝yel has not been applied, rule recv→yel must have been invoked earlier
in the sequence. By the definition of a t-ring, a send vertex $s_k$ must be
the predecessor of $r_j$. Since the rule recv→yel was applied to $r_j$, $s_k$
must be green. Hence, there is an earlier colouring in $\Sigma$ with a green
send vertex. This is a contradiction.

In the other direction, if rule recv⇝yel is invoked on receive vertex $r_j$,
then rule send→grn can be invoked on the corresponding send vertex $s_j$,
breaking the circular dependency. ■

Define the dependency graph of communication graph G = (V, A) to be H = 
(V, E) where all process arcs in A are reversed in E and all communication 
arcs in A are bidirectional in E. Define the depth d(v) of a vertex $v \in V$ to
be the maximum length path in H from v to a start vertex. 

Lemma 4.2 Let G be a communication graph with a token assignment of 0. For
any vertex v in G, if there exists a colouring sequence that colours vertex v 
green, there does not exist a colouring sequence that deadlocks before colouring 
v green. 

Proof: Proof by contradiction. Assume that there exist two colouring
sequences, such that one colouring sequence colours a vertex green and the
other deadlocks and does not colour the vertex green. Let $v \in V$ be such a
vertex of minimum depth; that is, all vertices of lesser depth will be coloured
green eventually by any colouring sequence on G. In order for a vertex to be
coloured green, its component predecessor must be green. Since the depth of
the predecessor is less than the depth of v, it can always be coloured green.
Furthermore, since a send and its corresponding receive vertex are adjacent
to each other, their depths differ by at most 1.

Since v must be a communication vertex, by rules send→grn and recv→grn,
the adjacent communication vertex t must be coloured yellow before v can be
coloured green. If vertex t is of a lesser depth than v, then t must be green
colourable in all colouring sequences; hence, v must also be green colourable.
If t is at the same depth as v, then its component predecessor is at a lesser
depth and must be green colourable; hence t is yellow colourable, and v is
green colourable. If t is at a greater depth than v, the component predecessor
of t, say u, is at the same or a lesser depth than t. If the latter, then u is
green colourable and t is yellow colourable; otherwise, we apply the same
argument to u first. Since there is no path from u to v in H (because
$d(u) < d(v)$), we need only recurse a finite number of times. ■

Lemma 4.3 If G is a communication graph whose dependency graph is acyclic,
then no colouring sequence on G will deadlock. 

Proof: Proof by contradiction. Assume that a colouring sequence deadlocks 
on G. Let v be the vertex of minimum depth that cannot be coloured green. If 
v is a send (receive) vertex, let u be the corresponding receive (send) vertex. 
Let vertex t be the component predecessor of vertex u and let vertex w be 
the component predecessor of vertex v. Since the dependency graph is acyclic, 
the depths of both t and w are less than the depths of u and v. Hence, both t 
and w may be coloured green based on our minimality assumption. However, 
then both u and v may be coloured green; this is a contradiction! If v is an 
end vertex, then it has only one predecessor, which is of a lesser depth, which 
leads to the same contradiction. ■ 



5 Buffer Allocation in Systems with Receive Side Buffers 

In systems with receive side buffers, messages are buffered only by the re- 
ceiver. Buffers are allocated by the receiving process when a message arrives, 
but cannot be received, and are freed when the message is received by the ap- 
plication. Analogously, when colouring a receive vertex of the corresponding 
communication graph, only a token belonging to the same process component 
may be used. We call this the receive side allocation scheme. 

5.1 The Buffer Allocation Problem 

In order to prevent deadlock in distributed applications, the underlying sys- 
tem needs to allocate a sufficient number of buffers. Ideally, it should be the 
minimum number required. Unfortunately, determining the required number 
of buffers, such that the system is safe, is intractable. 

The corresponding graph-based decision problem is this: given a communication
graph G and a positive integer k, determine if there is a token assignment
of size k such that no colouring sequence deadlocks on G. We show
that BAP$_r$ is NP-hard by a reduction of the well known 3SAT problem [5]
to BAP$_r$. Recall the definition of 3SAT: determine if there exists a
satisfying assignment to $\bigwedge_{j=1}^{c} (a_j \lor b_j \lor c_j)$, where
$a_j$, $b_j$, and $c_j$ are Boolean literals in
$\{x_1, \bar{x}_1, x_2, \bar{x}_2, \ldots, x_n, \bar{x}_n\}$.

Theorem 5.1 The Buffer Allocation Problem (BAP$_r$) is NP-hard.

Proof: Proof by reduction of 3SAT to BAP$_r$. For any 3SAT instance F
we construct a corresponding communication graph G such that for a token 
assignment of size n, any colouring sequence completes on G if and only if the 
corresponding variable assignment satisfies F. 

Let F be an instance of 3SAT with n variables and c clauses; the variables
are denoted $x_1, x_2, \ldots, x_n$, and the jth clause is denoted
$(a_j \lor b_j \lor c_j)$, where
$a_j, b_j, c_j \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. The
corresponding communication graph G comprises $2n + 1$ process components: 2n
of the components, called literal components, are labeled $P_{x_i}$ and
$P_{\bar{x}_i}$, $i = 1 \ldots n$, and correspond to the literals of F. The
last component, called the barrier component, is labeled $P_{\text{barrier}}$.

Each process component is divided into $c + 1$ epochs, where each epoch is
a consecutive sequence of zero or more vertices within the component. All
epochs are synchronized, that is, the vertices of one epoch must be coloured
green before any of the vertices in the next epoch may be coloured. To ensure
this we use a barrier component; the jth epoch of the barrier component,
$j = 0, \ldots, c$, comprises 2n receive vertices, labeled $q_{l,j}$, and 2n
send vertices, labeled $t_{l,j}$,
$l \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. At the end of each epoch
there is an arc from each of the literal components $P_l$,
$l \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$, to the barrier component.
Each arc emanates from vertex $s_{l,j}$, called a barrier send vertex, and is
incident on vertex $q_{l,j}$, where
$l \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ and $j = 0 \ldots c$.
These arcs are followed by arcs emanating from the barrier component to the
literal components; the arcs emanate from vertices $t_{l,j}$ and are incident
on vertices $r_{l,j}$, called barrier receive vertices. The barrier widget has
no cyclic dependencies. Hence, by Lemma 4.3, no colouring sequence will
deadlock on a barrier widget.

Epoch 0 fixes a token assignment corresponding to a variable assignment in
3SAT. Each pair of process components, $P_{x_i}$ and $P_{\bar{x}_i}$,
$i = 1 \ldots n$, forms a variable widget, which corresponds to a variable.
The two process components of a pair share a 2-ring; see Figure 4. By
Lemma 4.1, at least one token must be assigned to either process component
$P_{x_i}$ or $P_{\bar{x}_i}$ to prevent all colouring sequences from
deadlocking on G. Since only n tokens are available, each component pair can
be assigned exactly one token. Finally, assigning the token to process
component $P_{x_i}$ or $P_{\bar{x}_i}$ corresponds to fixing variable $x_i$ to
true or false. The epoch terminates with a barrier send vertex $s_{l,0}$,
followed by a barrier receive vertex $r_{l,0}$.

Epoch j of each process component corresponds to the jth clause of F. The
epoch of a process component $P_l$, $l \ne a_j, b_j, c_j$ (not labeled by a
literal of the jth clause), contains only two vertices: the barrier send
vertex $s_{l,j}$ and the barrier receive vertex $r_{l,j}$. The three process
components $P_{a_j}$, $P_{b_j}$, and $P_{c_j}$, whose labels correspond to the
literals in the jth clause, share a 3-ring in the jth epoch; see Figure 4. By
Lemma 4.1, to avoid deadlock, at least one of the three process components
must have a token. If none of the components are assigned a token, all
literals in the jth clause are false. The epoch is terminated by the barrier
send and the barrier receive vertices.

A satisfying assignment on F satisfies at least one literal in every clause. A 
corresponding token assignment assigns a token to the corresponding pro- 
cess component in each 3-ring — corresponding to the true literal. Hence, by 
Lemma 4.1 none of the colouring sequences will deadlock on any of the clause 
widgets and any colouring sequence on G will complete. 

For a falsifying assignment of F, there exists at least one clause compris- 
ing false literals. The corresponding token assignment fails to assign any to- 
kens to the process components that share the corresponding 3-ring. Thus, by 
Lemma 4.1 all colouring sequences will deadlock in that clause widget. 

Hence, for a token assignment of size n, any colouring sequence on G will 
complete if and only if the corresponding assignment satisfies F. Since finding 
a token assignment of size n such that no colouring sequence on G deadlocks 
is as hard as finding a satisfying assignment for F, BAP$_r$ is NP-hard. ■
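The correspondence used in the proof can be sketched at the level of rings rather than the full graph G. The sketch assumes the abstraction that, thanks to the barriers and Lemma 4.1, a token assignment is deadlock free exactly when every ring contains a process component holding a token; it does not construct G itself. Literals are encoded as signed integers (i for $x_i$, -i for $\bar{x}_i$), each clause is assumed to mention three distinct variables, and the function names are hypothetical.

```python
from itertools import product

def rings_of(n, clauses):
    """Rings of the construction, each given as a set of literal components."""
    variable_rings = [{i, -i} for i in range(1, n + 1)]    # epoch 0: 2-rings
    clause_rings = [set(clause) for clause in clauses]     # epoch j: 3-rings
    return variable_rings + clause_rings

def covers(tokens, rings):
    """True iff every ring has at least one component that holds a token."""
    return all(ring & tokens for ring in rings)

def find_token_assignment(n, clauses):
    """Search the n-token assignments that give one token per variable pair.

    Returns the set of literal components holding a token, or None if no
    such assignment covers every ring (i.e. the formula is unsatisfiable).
    """
    rings = rings_of(n, clauses)
    for bits in product([True, False], repeat=n):
        tokens = {i if bit else -i for i, bit in enumerate(bits, start=1)}
        if covers(tokens, rings):
            return tokens
    return None
```

A returned set such as `{1, 2, -3}` reads as the variable assignment $x_1$ = true, $x_2$ = true, $x_3$ = false; the search itself is exponential in n, as expected for an NP-hard problem.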
5.2 The Buffer Sufficiency Problem



A potentially simpler problem involves verifying whether a given buffer
assignment is sufficient to prevent deadlock. Formally, given a graph G and a
token assignment on G, determine if none of the colouring sequences on G
deadlock. This problem turns out to be intractable as well.

We show that BSP$_r$ is coNP-complete by a reduction from the TAUTOLOGY
problem [13, Page 261] to BSP$_r$. Given an instance of a formula F in
disjunctive normal form (DNF), $\bigvee_{i} \bigwedge_{j} a_{i,j}$, where
$a_{i,j} \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$, the formula is a
tautology if it is satisfied by all assignments. An assignment that
falsifies F is a concise proof that the formula is not a tautology. We shall
restrict our attention to 3DNF formulas, where each term has three literals:
$\bigvee_{i=1}^{t} (a_i \land b_i \land c_i)$.

Theorem 5.2 The Buffer Sufficiency Problem (BSP$_r$) is coNP-complete.

Proof: Let G be a communication graph along with a token assignment. If
there exists a deadlocking colouring sequence on G, then the sequence itself
is a certificate. The length of the sequence is at most twice the number of
vertices in G, since each vertex changes colour at most twice. Hence, BSP$_r$
is in coNP.

Let F be a 3DNF formula with t terms where each term has three literals. For 
any 3DNF formula F, we construct a communication graph G and fix a token 
assignment such that there is a colouring sequence on G that deadlocks if and 
only if the corresponding assignment falsifies F. The construction consists of 
four types of widgets that correspond to fixing an assignment, a term in the 
disjunction, the disjunction, and the interconnects between widgets. 

Each variable in F is represented by a variable widget comprising three process
components that are labeled $P_{x_i}$, $P_{\bar{x}_i}$, and $Q_i$. The latter,
called the arbitrator component, comprises three receive vertices, labeled
$q_i$, $r_{x_i}$, and $r_{\bar{x}_i}$. The former two process components,
called variable components, comprise two send vertices each. The first,
labeled $s_{x_i}$ ($s_{\bar{x}_i}$), is adjacent to the corresponding receive
vertex $r_{x_i}$ ($r_{\bar{x}_i}$) in the arbitrator component. The second,
labeled $t_{x_i}$ ($t_{\bar{x}_i}$), is adjacent to receive vertices in widgets
called dispersers, described later. The vertex $q_i$ in the arbitrator
component is similarly adjacent to a vertex in a disperser widget. The
corresponding token assignment for each variable widget assigns one token to
$Q_i$ and no tokens to the other two components; see Figure 5. The widget has
the following property:

Property 5.3 Let G be a communication graph that contains a variable widget.
Any colouring sequence on G may colour exactly one of the two vertices
$t_{x_i}$ or $t_{\bar{x}_i}$ yellow before vertex $q_i$ is coloured green.
Fig. 5. The construction.



Proof: By rule send→yel, in order for $t_{x_i}$ ($t_{\bar{x}_i}$) to be
coloured yellow, vertex $s_{x_i}$ ($s_{\bar{x}_i}$) must be coloured green.
Hence, by rule send→grn, vertex $r_{x_i}$ ($r_{\bar{x}_i}$) must first be
coloured yellow. Since vertex $q_i$ is red, vertex $r_{x_i}$
($r_{\bar{x}_i}$) can only be coloured yellow via the buffered rule recv⇝yel.
However, there is only one token assigned to process component $Q_i$; hence
rule recv⇝yel may only be invoked once. ■

The jth term in the disjunction is represented by a term widget comprising a
process component, which is called the term component and labeled $P_j$. The
first part of each term component consists of a send vertex $s_j$ and a
receive vertex $r_j$; these vertices are part of a t-ring. In the first term
component, $P_1$, there is an additional send vertex labeled
$s_{\text{done}}$; these are described in the next paragraph. The second part
of each term component consists of three receive vertices labeled
$r_{j,a_j}$, $r_{j,b_j}$, and $r_{j,c_j}$, where
$a_j, b_j, c_j \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$ correspond to
the literals in the jth term; see Figure 5. These receive vertices are
adjacent to send vertices in widgets called dispersers, which are described
later. The term components are used to construct a disjunction widget.

The disjunction widget comprises t term components, where the first two
vertices, $s_j$ and $r_j$, are part of a t-ring spanning all t components. Specifically,
each send vertex $s_j$, $j < t$, is adjacent to receive vertex $r_{j+1}$ and vertex $s_t$ is
adjacent to receive vertex $r_1$; see Figure 5. Each term component is assigned
one token. The disjunction widget has the following property.

Property 5.4 Let G be a communication graph that contains a disjunction
widget. Any colouring sequence on G can colour $r_j$, $j \in [1, t]$, green if and only
if at least one of $r_k$, $k \in [1, t]$, is coloured yellow before any $r_{k,a_k}$, $r_{k,b_k}$, or
$r_{k,c_k}$ are coloured yellow.

Proof: By Lemma 4.1, vertex $r_j$ can be coloured green if and only if rule
recv→yel is invoked, colouring one of the receive vertices $r_k$, $k \in [1, t]$, yellow.
The rule may be invoked if and only if a token is available. Since each
term component only has one token assigned and since vertex $r_k$ precedes
vertices $r_{k,a_k}$, $r_{k,b_k}$, and $r_{k,c_k}$, a token is available if and only if none of the
vertices $r_{k,a_k}$, $r_{k,b_k}$, and $r_{k,c_k}$ are coloured yellow via rule recv→yel before
vertex $r_k$ is coloured yellow. ■

Once vertex $r_k$, $k \in [1, t]$, is coloured yellow, all $r_j$, $j = 1 \ldots t$, may be
coloured green, and vertex $s_{done}$ may be coloured yellow. We now describe how
the widgets are connected together using disperser widgets. Let s be a send
vertex and R be a set of receive vertices. An (s, R)-disperser comprises $|R| + 1$
process components: one master component, labeled $M_s$, and $|R|$ slave
components labeled $S_r$, $r \in R$. The master component comprises one receive
vertex labeled $r_s$, followed by $|R|$ send vertices labeled $s_r$, $r \in R$. Each send
vertex is adjacent to the receive vertex on the corresponding slave component
$S_r$. Each slave component has two vertices: a receive vertex $q_r$, followed by a
send vertex $t_r$; see Figure 6. The latter vertex is adjacent to the receive vertex r
in some other widget. None of the components are assigned any tokens. The
following property of a disperser follows from Lemma 4.3.








Fig. 6. The disperser widget. 

Property 5.5 Let G be a communication graph containing an (s, R)-disperser.
If a colouring sequence colours vertex $r_s$ yellow, then the colouring sequence
can be extended to colour all vertices $t_r$, $r \in R$, yellow.
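
The disperser construction can be sketched in a few lines of Python. This is only an illustration: the string labels, the dictionary layout, and the function name `build_disperser` are hypothetical, and the arc from s to the master's receive vertex $r_s$ is an assumption drawn from the naming rather than an explicit statement in the text.

```python
from typing import Dict, List, Tuple

Vertex = str
Arc = Tuple[Vertex, Vertex]  # (send vertex, receive vertex)


def build_disperser(s: Vertex, receivers: List[Vertex]):
    """Return (components, arcs, tokens) for an (s, R)-disperser.

    components maps a component name to its vertices in program order.
    All labels are illustrative, not the paper's notation.
    """
    master = f"M[{s}]"
    components: Dict[str, List[Vertex]] = {master: [f"r[{s}]"]}
    # Assumption: the external send vertex s feeds the master's receive vertex.
    arcs: List[Arc] = [(s, f"r[{s}]")]
    for r in receivers:
        slave = f"S[{r}]"
        # After its receive, the master sends once to every slave.
        components[master].append(f"s[{s}->{r}]")
        # Each slave receives from the master, then forwards to r.
        components[slave] = [f"q[{r}]", f"t[{r}]"]
        arcs.append((f"s[{s}->{r}]", f"q[{r}]"))
        arcs.append((f"t[{r}]", r))
    # None of the disperser components are assigned any tokens.
    tokens = {name: 0 for name in components}
    return components, arcs, tokens
```

For a set R of three receive vertices the sketch yields four components (one master and three slaves) and seven arcs.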

Let $R_{x_i}$, $i = 1 \ldots n$, be the set of receive vertices labeled $r_{j,x_i} \in P_j$,
$j \in [1, t]$, and let $R_{\bar{x}_i}$ be similarly defined; recall that $a_j, b_j, c_j$ are simply
literal placeholders in the vertex labels $r_{j,a_j}$, $r_{j,b_j}$, $r_{j,c_j}$. Hence, a
$(t_{x_i}, R_{x_i})$-disperser connects send vertex $t_{x_i} \in P_{x_i}$ to the vertices in $R_{x_i}$,
which belong to the term components. Furthermore, let Q be the set of receive
vertices $q_i$ (in the variable widgets), $i = 1 \ldots n$; a $(s_{done}, Q)$-disperser
connects vertex $s_{done}$ to all variable widgets via receive vertices $q_i$. The
construction of G comprises n variable widgets and one disjunction widget,
composed of t term widgets; these are connected together by a
$(s_{done}, Q)$-disperser and 2n $(t_a, R_a)$-dispersers, where
$a \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. We claim that there exists a colouring sequence
that deadlocks on G if and only if there is a falsifying assignment for formula F,
that is, F is not a tautology.



Suppose that F has a falsifying assignment x, that is, every term in the
disjunction is false because each term has a literal $x_i$ or $\bar{x}_i$ which is false. To
construct a colouring sequence on G that deadlocks, we construct a set of
vertices U. The first half of the colouring sequence is a maximal colouring
sequence involving only the vertices of U. The second half of the sequence
may involve all vertices in G. The resulting colouring sequence will always
deadlock.

Let $X = \{a \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\} \mid a|_x = 0\}$, which is the set of literals
that are false, and let $Z = \{s_a \in P_a \mid a \notin X\} \cup \{s_j \mid j = 1, \ldots, t\}$, which
contains the set of send vertices from the variable components that are labeled
by a true literal and the numbered send vertices in the disjunction widget; the
set Z contains the vertices which may not initially be coloured. Let
$U = V \setminus Z$ be the rest of the vertex set.

Consider a colouring sequence involving only vertices in U. By property 5.3
any maximal colouring sequence will colour the vertices $t_a$ yellow (in the
variable widget), where $a \in X$. Hence, by property 5.5 the vertices $t_r$ (in the
dispersers) will be coloured yellow, where $r \in \bigcup_{a \in X} R_a$; the send vertices $t_r$
in the dispersers are adjacent to the receive vertices in $R_a$. Since x is a
falsifying assignment, every term contains a literal which is falsified by x.
Without loss of generality, let $a_j$ denote a literal that is false in the jth term;
therefore, process component $P_j$ contains a receive vertex $r_{j,a_j}$, which is
adjacent to the yellow send vertex $t_{r_{j,a_j}}$ (in the disperser). Since none of the
send vertices of the t-ring (in the disjunction widget) are in U, the ring is still
coloured red, and the token belonging to component $P_j$ is used to apply rule
recv→yel to colour vertex $r_{j,a_j}$ yellow. Since every term has a false literal, the
colouring sequence colours a receive vertex $r_{j,a_j}$, $j = 1 \ldots t$, in every term
component $P_j$. After the sequence cannot be extended, allow all vertices to be
coloured; since vertices $r_{j,a_j}$ (in the term components), $j = 1 \ldots t$, have been
coloured yellow before vertex $r_j$ (in term component $P_j$), according to
property 5.4, the sequence will deadlock.

If a colouring sequence on G deadlocks, according to property 5.4, deadlock
occurs only if there is a yellow vertex labeled $r_{k,a_k}$, $r_{k,b_k}$, or $r_{k,c_k}$ in each of
the term components. Their predecessors, the vertices $t_l$,
$l \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$, in the dispersers, must be green. Since the colouring
sequence is maximal, by property 5.3 exactly one of $t_{x_i}$ or $t_{\bar{x}_i}$ is red, thus this
corresponds to a valid assignment: setting $x_i = 0$ if $t_{x_i}$ is green, or $x_i = 1$ if
$t_{\bar{x}_i}$ is green, yields an assignment that falsifies F.

Thus, a colouring sequence on G deadlocks if and only if the corresponding
assignment falsifies F. Hence, $BSP_r$ is coNP-complete. ■

Therefore, just determining whether a buffer assignment is sufficient is
intractable, even one as simple as in the preceding example. Intuitively, the
buffers of a process are assigned based on the behaviour of other processes;
thus, buffer utilization is not locally decidable. Further, the order in which
buffers are assigned is nondeterministic, exploding the search space of possible
buffer utilizations. This phenomenon, which our proofs rely on, is what we
call buffer stealing. For example, in a system corresponding to the variable
widget (see Figure 5), the first process to send its message gets the buffer, and
the other process remains blocked until the arbitrator performs the receives.
This stealing corresponds to fixing a value of a variable. Similarly, the system
corresponding to the disjunction widget allocates buffers for each of the
term processes. However, if the buffer is stolen in all terms, corresponding to
a falsifying assignment, then the system will deadlock within the ring.

For completeness, we note the following corollary: 

Corollary 5.6 The Buffer Allocation Problem ($BAP_r$) is in $\Sigma_2^P$.

Proof: By Theorem 5.2, verifying that a token assignment is sufficient to
prevent deadlock ($BSP_r$) is coNP-complete. Since we can nondeterministically
guess a sufficient token assignment, the result follows. ■

5.3 The Nonblocking Buffer Allocation Problem 

In addition to the system being safe, we can require that no sending process
ever blocks due to insufficient buffers on the receiving process. The
Nonblocking Buffer Allocation Problem ($NBAP_r$) is to determine the minimum
number of buffers needed to achieve nonblocking sends.

Formally, the corresponding decision problem is this: given a communication
graph G and an integer k, determine if there exists a token assignment of
size k such that no colouring sequence on G blocks. Recall that a colouring
sequence does not block if, whenever a send vertex is coloured yellow, rule
recv→yel may be applied to the corresponding receive vertex.

Let $P_i$ and $P_j$, $j \neq i$, be two process components. Given two vertices, $v_{i,c}$
and $v_{i,t}$, in $P_i$, $t > c$, vertex $v_{i,t}$ is communication dependent on vertex
$v_{i,c}$ if $v_{i,c}$ is the start vertex or if there exists a vertex $v_{j,d} \in P_j$ such that
there is a path from $v_{i,c}$ to $v_{j,d}$ and the arc $(v_{j,d}, v_{i,t})$ is in A (see Figure 7).
Vertex $v_{i,t}$ is terminally communication dependent on vertex $v_{i,c}$ if $v_{i,t}$
is communication dependent on $v_{i,c}$ and is not communication dependent on
the vertices $v_{i,l}$, $c < l < t$. The algorithm depicted in Figure 8 computes an
optimal token assignment such that no colouring sequence on G can block.

Remark 5.7 In a system corresponding to communication graph G, the time
between a message arriving at process i and its receipt corresponds to the
interval $I_{i,t}$. Each interval must have a buffer to ensure nonblocking sends.
Hence, the minimum number of buffers, $b_i$, is the maximum overlap over all
intervals within process $p_i$.

Fig. 7. $v_{i,t}$ is communication dependent on $v_{i,c}$.

(1) For each receive vertex $v_{i,t}$ determine its terminal communication
dependency, vertex $v_{i,c}$, where $t > c$.

(2) Set $I_{i,t} = [c, t]$ to be the interval between vertex $v_{i,c}$ and vertex $v_{i,t}$.

(3) For each process component $P_i$, compute $b_i$, the maximum overlap
over all intervals $I_{i,t}$.

(4) $B = \{b_1, b_2, \ldots, b_n\}$ is the optimal nonblocking token assignment.

Fig. 8. Algorithm for computing an optimal nonblocking buffer assignment.

Computing the terminal communication dependencies for G can be done via
dynamic programming in $O(|V|n)$ time, where V is the vertex set of G and n
is the number of process components. If there exists a path from vertex $v_{i,c}$ to
$v_{j,d}$, then there exists a path from $v_{i,c}$ to all vertices $v_{j,d+k}$, $k > 0$. Associate
with each vertex $v_{i,c}$ an integer vector $a_{i,c}$ of size n; $a_{i,c}[j] = d$ means that
there exists a path from $v_{i,c}$ to $v_{j,d}$, and thus to $v_{j,d+k}$, $k > 0$. The vector
$a_{i,c}$ is computed by taking the elementwise minimums over the vectors of the
vertices adjacent to $v_{i,c}$; this is simply a depth-first traversal of G. Since the
number of arcs is bounded by $3|V|/2$ and the pairwise comparison takes n
steps, the traversal takes $O(|V|n)$ time.
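
The vector computation can be sketched in Python as follows. The encoding is an assumption made for illustration: a vertex is a pair (i, c), position 0 of every component stands for its start vertex, `sizes[i]` is the number of communication vertices of component i, and `arcs` lists (send, receive) pairs. The function names are hypothetical, and the graph is assumed to be acyclic so that the memoized depth-first traversal terminates.

```python
import math
from typing import Dict, List, Tuple

Vertex = Tuple[int, int]  # (component index i, position c); c == 0 is the start vertex


def earliest_reach(sizes: List[int],
                   arcs: List[Tuple[Vertex, Vertex]]) -> Dict[Vertex, List[float]]:
    """For every vertex v, reach[v][j] is the smallest d such that a path
    leads from v to vertex (j, d); math.inf if component j is unreachable."""
    n = len(sizes)
    partner = dict(arcs)  # send vertex -> matching receive vertex
    memo: Dict[Vertex, List[float]] = {}

    def visit(v: Vertex) -> List[float]:
        if v in memo:
            return memo[v]
        i, c = v
        vec: List[float] = [math.inf] * n
        vec[i] = c  # v trivially reaches itself
        successors = []
        if c < sizes[i]:
            successors.append((i, c + 1))  # next vertex in program order
        if v in partner:
            successors.append(partner[v])  # communication arc
        for w in successors:
            # Elementwise minimum over the successors' vectors.
            vec = [min(a, b) for a, b in zip(vec, visit(w))]
        memo[v] = vec
        return vec

    for i, size in enumerate(sizes):
        for c in range(size + 1):
            visit((i, c))
    return memo


def terminal_dependency(reach: Dict[Vertex, List[float]],
                        send: Vertex, recv: Vertex) -> int:
    """Position c of the vertex on which `recv` terminally depends."""
    j, d = send
    i, t = recv
    for c in range(t - 1, 0, -1):  # latest candidate first
        if reach[(i, c)][j] <= d:
            return c
    return 0  # falls back to the start vertex


# Two components exchanging one message in each direction.
sizes = [2, 2]
arcs = [((0, 1), (1, 1)), ((1, 2), (0, 2))]
reach = earliest_reach(sizes, arcs)
print(terminal_dependency(reach, (1, 2), (0, 2)))  # 1, giving the interval [1, 2]
print(terminal_dependency(reach, (0, 1), (1, 1)))  # 0, giving the interval [0, 1]
```

Each vertex is visited once and each visit merges at most two vectors of length n, which matches the stated $O(|V|n)$ bound for this encoding.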

Next, computing the $O(|V|)$ intervals, $I_{i,t}$, requires one table lookup per
interval. To compute the maximum overlap we sort the intervals and perform a
sweep, keeping track of the current and maximum overlap; this takes
$O(|V| \log |V|)$ time. Thus, the total complexity is $O(|V|n + |V| \log |V|)$ time.
In the worst case, where $n \approx |V|$, this algorithm is quadratic. However, in
practice n is usually fixed, in which case the $|V| \log |V|$ term dominates.
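
A minimal sketch of the sort-and-sweep step, assuming the intervals of one process component are given as (c, t) pairs of integer positions. The function name `max_overlap` is hypothetical. Closing events sort before opening events at the same position, so intervals that merely touch are not counted as overlapping; since an interval opens at a send (or start) vertex and closes at a receive vertex, such ties are not expected in this setting.

```python
from typing import List, Tuple


def max_overlap(intervals: List[Tuple[int, int]]) -> int:
    """Largest number of intervals that are open at the same time."""
    events = []
    for start, end in intervals:
        events.append((start, 1))   # interval opens
        events.append((end, -1))    # interval closes
    events.sort()                   # O(m log m); -1 sorts before +1 on ties
    current = best = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


# Four receives that all depend on the start vertex need four tokens.
print(max_overlap([(0, 1), (0, 2), (0, 3), (0, 4)]))  # 4
# Two disjoint intervals can share a single token.
print(max_overlap([(0, 1), (2, 3)]))                  # 1
```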



5.3.1 Proof of Correctness of the Nonblocking Buffer Allocation Algorithm 

Lemma 5.8 Let G be a communication graph. For all vertices $v_{i,c}, v_{j,d} \in G$,
if $v_{j,d}$ is a send vertex and there exists a path from the vertex $v_{i,c}$ to vertex
$v_{j,d}$, then vertex $v_{j,d}$ cannot be coloured yellow until vertex $v_{i,c}$ is coloured
green.

Proof: By rule send→yel, the predecessor of $v_{j,d}$ must first be coloured green
before $v_{j,d}$ can be coloured yellow. Since rules send→grn and recv→grn imply
that the predecessors of a green vertex must be green, the result follows. ■

Corollary 5.9 Let G, $v_{i,c}$, and $v_{j,d}$ be as in Lemma 5.8 and let $v_{i,t}$ be the
receive vertex corresponding to the send vertex $v_{j,d}$. Rule recv→yel will never
be applied to vertex $v_{i,t}$ before vertex $v_{i,c}$ is coloured green.

The preceding corollary implies that a token, which is needed to colour the
receive vertex $v_{i,t}$ yellow, need not be available until the vertex on which it is
terminally communication dependent is coloured green. Hence, it is sufficient
to ensure token availability just before colouring the respective send vertex
green; this is also necessary.

Theorem 5.10 Given G, let $v_{i,c}$ be a send vertex and $v_{i,t}$ be a receive vertex
that is terminally communication dependent on vertex $v_{i,c}$. A token for the
application of rule recv→yel on arc $(v_{j,d}, v_{i,t})$ must be available as soon as
vertex $v_{i,c}$ is coloured green.

Proof: Let $v_{j,d}$ be the send vertex corresponding to the receive vertex $v_{i,t}$
and let $Q = \{v_{i,q} \mid c < q < t\}$ be the set of vertices that are predecessors of
$v_{i,t}$, but on which $v_{i,t}$ is not communication dependent.

Since $v_{i,t}$ is not communication dependent on the vertices in Q, we can
construct a colouring sequence on G that fixes the vertices in Q to be red, and
colours vertex $v_{j,d}$ yellow, making the application of rule recv→yel possible in
the next step. Since no progress is made in the ith process component after
colouring vertex $v_{i,c}$ green, the state of the associated token pool does not
change until the application of rule recv→yel to vertex $v_{i,t}$. Hence, when
vertex $v_{i,c}$ is coloured green, the token pool must have a token destined for arc
$(v_{j,d}, v_{i,t})$. ■

Thus, if a receive vertex r is terminally communication dependent on a send
vertex s, then it is necessary and sufficient that a token, which is used to
apply rule recv→yel to receive vertex r, must be available as soon as the send
vertex s is coloured green; the start vertex may be thought of as a special send
vertex. Since the interval corresponding to r begins when s is coloured green,
and ends when r is coloured green, a token must be available for the recv→yel
rule, which can occur during this interval. Computing the maximum overlap
of intervals yields the required number of tokens.

5.3.2 Example Use of the $NBAP_r$ Algorithm



To demonstrate the $NBAP_r$ algorithm we have implemented it and analyzed
the pipe-and-roll parallel matrix multiplication algorithm [10]. The program
has one control process and a number of worker processes arranged in a
two-dimensional mesh. We ran the $NBAP_r$ algorithm on meshes of size 2 x 2,
3 x 3, and 4 x 4. The communication graph for the smallest example, comprising
four workers ordered in a 2 x 2 mesh, is depicted in Figure 9. The corresponding
optimal buffer assignment is listed in the second column of Table 1.

Fig. 9. The communication system for a 2 x 2 worker process mesh.

In this example, process 0 is the control process and processes 1 through 4 are
the workers. The control process needs 4 buffers and the workers each need 3
to execute without blocking. Executing the $NBAP_r$ algorithm on a 3 x 3 worker
system yields 9 buffers for the control process and between 4 and 5 buffers for
the worker processes. For the 4 x 4 system the numbers are 16 for the control
process and between 5 and 7 buffers for the workers.



5.3.3 Approximating $BAP_r$ with $NBAP_r$

Table 1. The result of running the $NBAP_r$ algorithm on the 2 x 2 worker example.

The $NBAP_r$ algorithm is useful for determining a token assignment that
prevents deadlock, that is, approximating $BAP_r$. Since a nonblocking colouring
sequence does not deadlock, a token assignment determined by the $NBAP_r$
algorithm ensures that the graph is deadlock free. However, the token
assignment may be far from optimal. A simple example of this phenomenon is a
graph of two process components with n arcs emanating from the first
component and incident on the second. Such a graph requires zero tokens to avoid
deadlock, but requires n tokens to be block free. Thus, the aforementioned
token assignment may entail many more tokens than required.



6 Buffer Allocation in Systems with Send Side Buffers 



In this section we consider the second of the four buffer placement strategies:
send side buffers. Buffers are now allocated on the sending process side if the
receive is not ready to accept the message. Correspondingly, the token pool
used when applying rule recv→yel to the receive vertex of arc (s, r) belongs
to the process component containing the send vertex s. We call this the send
side allocation scheme.

The Buffer Allocation Problem ($BAP_s$) remains intractable. The problem
is conjectured to be NP-complete (see the following paragraph). The
NP-hardness follows from the observation that each t-ring in the construction in
Theorem 5.1 has to have a token assigned to a process component pair in
order to prevent deadlock. It does not matter if the token is allocated from
the token pool of the sending or the receiving process component. Hence, the
reduction used in Theorem 5.1 can be applied with no modification.

We conjecture that the corresponding Buffer Sufficiency Problem ($BSP_s$) is in
P. This is because the relative order in which tokens from a particular token
pool are utilized is invariant with respect to the colouring sequences. Hence,
we believe that determining sufficiency is similar to the nonblocking buffer
allocation problem and hence is in P. If this is the case, $BAP_s$ is NP-complete.

The Nonblocking Buffer Allocation Problem ($NBAP_s$) remains in P. The
problem can be solved by first reversing all arcs in the communication graph,
swapping the start and end vertices, and then running the algorithm described in
Figure 8.
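
The reversal step can be sketched as below, under an encoding chosen purely for illustration: a vertex is a pair (i, c) with positions 1 through `sizes[i]`, the start vertex of component i sits implicitly at position 0 and the end vertex at position `sizes[i] + 1`. The function name `reverse_graph` is hypothetical.

```python
from typing import List, Tuple

Vertex = Tuple[int, int]  # (component index, position in program order)


def reverse_graph(sizes: List[int],
                  arcs: List[Tuple[Vertex, Vertex]]) -> List[Tuple[Vertex, Vertex]]:
    """Reverse every communication arc and the program order of every component."""

    def flip(v: Vertex) -> Vertex:
        i, c = v
        # Position 0 (start) maps to sizes[i] + 1 (end) and vice versa.
        return (i, sizes[i] + 1 - c)

    # The old receive vertex becomes the send vertex of the reversed arc.
    return [(flip(recv), flip(send)) for send, recv in arcs]


sizes = [2, 2]
arcs = [((0, 1), (1, 1)), ((0, 2), (1, 2))]
print(reverse_graph(sizes, arcs))
# [((1, 2), (0, 2)), ((1, 1), (0, 1))]
```

In the reversed graph the original send vertices play the role of receive vertices, so an interval computation over it would charge each token to the original sender's component.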



7 Buffer Allocation in Systems with Send and Receive Side Buffers 

So far we have considered systems exclusively with send side or receive side 
buffers. In this section we investigate systems with buffers on both the send 
and the receive sides; many communication systems use per-host buffer pools 
for both receiving and sending messages. The choice of where to buffer the 
message — on the sender or on the receiver — increases the difficulty of deter- 
mining the system's properties. 

We assume a lazy mechanism for utilizing buffers: first use a buffer from 
the sender's pool. If none is available, use a buffer from the receiver's pool. 
If neither is available, attempt to free a send side buffer by transferring its 
contents to a buffer belonging to the corresponding receiver. Intuitively, the 
system attempts to maximize buffer use, without attempting to predict the 
future. 

The corresponding colouring game allows tokens to be allocated from the
pools belonging to both the sending component and the receiving component.
Correspondingly, a lazy token utilization scheme is used: let $(s_i, r_j)$ be a
communication arc from process component $P_i$ to process component $P_j$. The
following rules apply during the application of rule recv→yel to vertex $r_j$:

(1) If a token belonging to component $P_i$ is available, use it.

(2) Otherwise, if a token belonging to component $P_j$ is available, use it.

(3) Otherwise, if a token belonging to component $P_i$ is currently placed on
arc $(t_i, r_k)$, $t_i \in P_i$, $r_k \in P_k$, and a token belonging to component $P_k$ is
available, then the token on arc $(t_i, r_k)$ may be replaced with the one
belonging to $P_k$, freeing a token to be used in the current application of
rule recv→yel.

We call this the mixed allocation scheme. 
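
The three rules can be sketched as a single function. This is an illustration under assumed data structures, not the authors' implementation: `free` maps a component name to its number of unused tokens, `placed` maps an arc to the component whose token currently sits on it, and an arc is a hypothetical triple (sending component, receiving component, tag).

```python
from typing import Dict, Optional, Tuple

Arc = Tuple[str, str, str]  # (sending component, receiving component, tag)


def acquire_token(free: Dict[str, int], placed: Dict[Arc, str],
                  arc: Arc) -> Optional[str]:
    """Place a token on `arc`; return its owner, or None if the send blocks."""
    sender, receiver = arc[0], arc[1]
    # Rule 1: prefer a token from the sender's pool.
    if free.get(sender, 0) > 0:
        free[sender] -= 1
        placed[arc] = sender
        return sender
    # Rule 2: otherwise take one from the receiver's pool.
    if free.get(receiver, 0) > 0:
        free[receiver] -= 1
        placed[arc] = receiver
        return receiver
    # Rule 3: otherwise move one of the sender's placed tokens to the pool
    # of that arc's receiver, and reuse the freed sender token here.
    for other, owner in list(placed.items()):
        other_receiver = other[1]
        if owner == sender and other[0] == sender and free.get(other_receiver, 0) > 0:
            free[other_receiver] -= 1
            placed[other] = other_receiver
            placed[arc] = sender
            return sender
    return None


free = {"P": 1, "Q": 0, "K": 1}
placed: Dict[Arc, str] = {}
acquire_token(free, placed, ("P", "K", "m1"))  # rule 1: P's own token
acquire_token(free, placed, ("P", "Q", "m2"))  # rule 3: K takes over m1
print(placed)  # {('P', 'K', 'm1'): 'K', ('P', 'Q', 'm2'): 'P'}
```

The sketch only covers token acquisition; releasing a token when the receive completes is left out.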

Not unexpectedly, the Buffer Allocation Problem ($BAP_{sr}$) remains intractable
within the mixed allocation scheme. This is because the receive side allocation
scheme, which provides no choice of token pools, can be simulated within
the mixed allocation scheme. Concretely, consider the receive side allocation
scheme analyzed in Section 5: to simulate the receive side allocation scheme
on communication graph G within the mixed allocation scheme, each arc in
G is replaced by the widget illustrated in Figure 10. Since vertex q cannot
be coloured green until vertex r is coloured yellow, and component P' has no
tokens, applying rule recv→yel to r requires that $P_j$ has an available token,
regardless of whether $P_i$ has an available token.






Fig. 10. Nullifying send side token pools. 

Similarly, the Buffer Sufficiency Problem ($BSP_{sr}$) within the mixed allocation
scheme is also coNP-complete. The hardness follows from Theorem 5.2 and
the preceding argument. Since a colouring sequence also serves as a deadlock
certificate in this case, the coNP-completeness result follows.

The interesting property of the mixed allocation scheme is that the
Nonblocking Buffer Allocation Problem ($NBAP_{sr}$) is intractable; the choice of token
pools increases the search space of solutions exponentially! The reduction is
from 3SAT.

Theorem 7.1 The Nonblocking Buffer Allocation Problem ($NBAP_{sr}$) is
NP-hard.

Proof: Let F be an instance of 3SAT, comprising n variables, labeled $x_i$,
$i = 1 \ldots n$, and c clauses. We construct a communication graph G such that
there exists a token assignment of n + 2 tokens that prevents any colouring
sequence from blocking on G if and only if the corresponding assignment
satisfies F.



The graph G comprises 2n + 3 process components: the first 2n are labeled
$P_{x_i}$ and $P_{\bar{x}_i}$, $i = 1 \ldots n$, and the remaining three process components are
labeled P, $Q_0$, and $Q_1$, respectively. The graph is divided into c + 1 epochs:
epoch 0 corresponds to the variable assignment, and epochs 1 through c
correspond to clause evaluation.

In epoch 0 each process component $P_{x_i}$ contains a single send vertex $s_i$ that is
adjacent to the receive vertex $r_i$ located in epoch 0 of process component
$P_{\bar{x}_i}$. Process component $Q_0$ (and $Q_1$) contains four vertices: two receive
vertices $q_{0,1}$ and $q_{0,2}$ (respectively $q_{1,1}$ and $q_{1,2}$), followed by two send
vertices $t_{0,1}$ and $t_{0,2}$ (respectively $t_{1,1}$ and $t_{1,2}$). Finally, process component P
contains eight vertices: two send vertices, $s_{0,1}$ and $s_{0,2}$, that are adjacent to
vertices $q_{0,1}$ and $q_{0,2}$; two receive vertices, $r_{0,1}$ and $r_{0,2}$, that are adjacent to
$t_{0,1}$ and $t_{0,2}$; two more send vertices, $s_{1,1}$ and $s_{1,2}$, that are adjacent to $q_{1,1}$
and $q_{1,2}$; and two more receive vertices, $r_{1,1}$ and $r_{1,2}$, that are adjacent to
$t_{1,1}$ and $t_{1,2}$. See Figure 11.

Epoch 0 has two important properties.

Property 7.2 Any token assignment must assign at least one token to either
component $P_{x_i}$ or $P_{\bar{x}_i}$ to prevent the colouring sequence from blocking after
colouring vertex $s_i$ yellow.

Property 7.3 A token assignment on G having only n + 2 tokens must assign
two tokens to process component P to prevent a colouring sequence from
blocking after yellow colouring one of the send vertices $s_{0,1}$, $s_{0,2}$, $s_{1,1}$, or
$s_{1,2}$.

Proof: Since n tokens must be allocated to the process components $P_{x_i}$ or
$P_{\bar{x}_i}$, $i = 1, \ldots, n$, this leaves only two tokens to be allocated. Since the
colouring rule sequence send→yel, recv→yel, send→grn, send→yel, recv→yel can
colour send vertices $s_{0,1}$ and $s_{0,2}$, or send vertices $s_{1,1}$ and $s_{1,2}$, component
pairs $(P, Q_0)$ and $(P, Q_1)$ must each have two tokens between them. This can
only happen by assigning the tokens to P. ■



A corollary of these properties is that once a legal token assignment is made,
no colouring sequence will block in epoch 0. The choice of allocating the token
on $P_{x_i}$ versus $P_{\bar{x}_i}$ corresponds to fixing the variable assignment.







Fig. 11. Reduction from 3SAT to NBAP sr . 

For $j = 1 \ldots c$, epoch j corresponds to the jth clause. Each epoch comprises
two parts of six arcs each: the synchronization part and the evaluation part.
Four process components are involved in an epoch: the three components,
$P_{a_j}$, $P_{b_j}$, and $P_{c_j}$, whose labels are the literals in the jth clause, where
$a_j, b_j, c_j \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$, and component P, which is involved in every
epoch. Epoch j of component $P_{a_j}$ comprises four vertices: receive vertex
$r_{a_j,j}$, send vertex $t_{a_j,j}$, receive vertex $r'_{a_j,j}$, and send vertex $t'_{a_j,j}$. Process
components $P_{b_j}$ and $P_{c_j}$ are analogously formed.

In epoch j component P has 12 vertices; the first six are these: send vertex
$s_{a_j,j}$, receive vertex $q_{a_j,j}$, send vertex $s_{b_j,j}$, receive vertex $q_{b_j,j}$, send
vertex $s_{c_j,j}$, and receive vertex $q_{c_j,j}$. These are followed by three send
vertices: $s'_{a_j,j}$, $s'_{b_j,j}$, and $s'_{c_j,j}$, and three receive vertices: $q'_{a_j,j}$, $q'_{b_j,j}$,
and $q'_{c_j,j}$.

For $l \in \{a_j, b_j, c_j\}$, each vertex $s_{l,j}$ is adjacent to vertex $r_{l,j}$, each vertex
$t_{l,j}$ is adjacent to vertex $q_{l,j}$, each vertex $s'_{l,j}$ is adjacent to vertex $r'_{l,j}$, and
each vertex $t'_{l,j}$ is adjacent to vertex $q'_{l,j}$; see Figure 11. For conciseness we
drop the last index, j, if it is obvious from the context. Epoch j has three
important properties:

Property 7.4 If vertex $q'_{c_j}$ (in epoch j) is coloured green and vertex $s_{a_{j+1}}$ (in
epoch j + 1) is still red, then no tokens that belong to component P are assigned
to arcs. The same applies to vertex pairs $(q_{a_j}, s_{b_j})$, $(q_{b_j}, s_{c_j})$, and $(q_{c_j}, s'_{a_j})$,
also in epoch j.

Proof: All ancestors of $q'_{c_j}$ must be coloured green and all descendants of
$s_{a_{j+1}}$ must be coloured red. This includes all vertices in G, except some
vertices $s_i$ and $r_i$ in epoch 0, which are not adjacent to vertices in component P.
Hence, the tokens belonging to P are not assigned to any arc. The same
argument applies to the other vertex pairs. ■

Property 7.5 A colouring sequence on G can block only when yellow colouring
receive vertices $r'_{a_j}$, $r'_{b_j}$, $r'_{c_j}$, $q'_{a_j}$, $q'_{b_j}$, or $q'_{c_j}$.

Proof: As a corollary of properties 7.2 and 7.3, no colouring sequence can
block in epoch 0. Thus, we need only check that no colouring sequence can
block in the first part of epoch j, $j = 1 \ldots c$.

By property 7.4, if $s_{a_j}$ is red and its predecessor is green, then no tokens of
P are in use. Hence, to colour $s_{a_j}$ green, a token is available to colour $r_{a_j}$
yellow. Since vertex $r_{a_j}$ is a predecessor of $t_{a_j}$, vertex $r_{a_j}$ must be coloured
green before $t_{a_j}$ may be coloured yellow. Thus the token is freed before $t_{a_j}$
is coloured green, and may be used to colour vertex $q_{a_j}$ yellow after $t_{a_j}$ is
coloured yellow. A similar argument applies to the vertices $r_{b_j}$, $q_{b_j}$, $r_{c_j}$, and
$q_{c_j}$. ■

Property 7.6 A colouring sequence can block in epoch j if and only if none
of the three process components, $P_{a_j}$, $P_{b_j}$, and $P_{c_j}$, have a token assigned.

Proof: For the 'if' direction consider a colouring sequence that colours vertex
$q_{c_j}$ green, but has not yet coloured vertex $s'_{a_j}$ yellow. By definition, blocking
does not occur if rule recv→yel may always be applied to colour a receive
vertex yellow. To colour the send vertices $s'_{a_j}$, $s'_{b_j}$, and $s'_{c_j}$ yellow and then
green, the receive vertices $r'_{a_j}$, $r'_{b_j}$, and $r'_{c_j}$ must be coloured yellow via rule
recv→yel. Since the receive vertices $r'_{a_j}$, $r'_{b_j}$, and $r'_{c_j}$ are not ancestors of the
send vertices $s'_{a_j}$, $s'_{b_j}$, and $s'_{c_j}$, none of the receive vertices need be coloured
green before the send vertices are coloured yellow. However, component P has
only two tokens, and none of components $P_{a_j}$, $P_{b_j}$, $P_{c_j}$ have any. Hence, rule
recv→yel can only be invoked twice, instead of the requisite three times. Thus,
a colouring sequence can block in epoch j.

For the 'only if' direction we claim that if a literal component $P_{a_j}$, $P_{b_j}$, or
$P_{c_j}$ has a token, rule recv→yel can be invoked on any of the six receive vertices
$r'_{a_j}$, $r'_{b_j}$, $r'_{c_j}$, $q'_{a_j}$, $q'_{b_j}$, and $q'_{c_j}$. Since $r'_{a_j}$ is a predecessor of $t'_{a_j}$,
$r'_{a_j}$ must be coloured green before $t'_{a_j}$, and hence before $q'_{a_j}$ is coloured
yellow. Thus, the same token that was allocated upon the application of rule
recv→yel to vertex $r'_{a_j}$ may also be allocated upon the application of rule
recv→yel to vertex $q'_{a_j}$; the same argument is applicable to vertices $q'_{b_j}$ and
$q'_{c_j}$. Applying rule recv→yel to vertices $r'_{a_j}$ and $r'_{b_j}$ uses the two tokens from
component P. To colour vertex $r'_{c_j}$ yellow there are three possible scenarios:

(1) the colouring sequence has already freed one of the tokens, allowing it to
be reused,

(2) component $P_{c_j}$ has a token, in which case it is used, or

(3) component $P_{a_j}$ (or $P_{b_j}$) has a token, in which case it replaces the token
used to yellow colour vertex $r'_{a_j}$ (or $r'_{b_j}$) and the freed token is used to
colour vertex $r'_{c_j}$.

Since at least one component $P_{a_j}$, $P_{b_j}$, or $P_{c_j}$ has a token, the claim is proven.
■

By property 7.6 a colouring sequence will block in epoch j if and only if none of
the process components $P_{a_j}$, $P_{b_j}$, or $P_{c_j}$ has a token, which corresponds to the
jth clause having no literals that are true. Thus, a token assignment of size
n + 2 prevents any colouring sequence on G from blocking if and only if the
corresponding assignment satisfies F. ■
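
The correspondence between variable assignments and token assignments can be illustrated with a short sketch. The literal encoding (variable index, polarity), the key "P", and both function names are assumptions made here for illustration; the check simply restates property 7.6 and does not simulate the colouring game.

```python
from typing import Dict, List, Tuple

Literal = Tuple[int, bool]  # (3, False) stands for the negation of x_3


def token_assignment(assignment: Dict[int, bool]) -> Dict[object, int]:
    """n + 2 tokens: two for component P, one per variable on the true literal."""
    tokens: Dict[object, int] = {"P": 2}
    for i, value in assignment.items():
        tokens[(i, True)] = 1 if value else 0
        tokens[(i, False)] = 0 if value else 1
    return tokens


def blocking_epochs(clauses: List[List[Literal]],
                    tokens: Dict[object, int]) -> List[int]:
    """Epochs (1-based) in which none of the three literal components has a token."""
    return [j for j, clause in enumerate(clauses, start=1)
            if all(tokens[lit] == 0 for lit in clause)]


clauses = [[(1, True), (2, False), (3, True)],
           [(1, False), (2, True), (3, False)]]
satisfying = token_assignment({1: True, 2: True, 3: False})
falsifying = token_assignment({1: False, 2: True, 3: False})
print(blocking_epochs(clauses, satisfying))  # []
print(blocking_epochs(clauses, falsifying))  # [1]
```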



8 Buffer Allocation in Channel Based Systems 

In channel based systems processes communicate via pairwise connections 
that are created at start-up. Each connection, called a channel, is specified 
by its end-points and is used by one process to send messages to the other. 






Each channel functions independently of other channels in the system, and 
resources such as buffers are allocated on a per channel basis, rather than per 
process. Finally, channels behave like queues, that is, messages are removed 
from the channel in the same order that they are inserted. 

Channels may either be unidirectional, comprising source and destination end- 
points, or bidirectional, comprising two symmetric end-points. In the former 
case, only the source process may insert messages into the channel and only the 
destination process may remove messages from the channel. A bidirectional
channel is equivalent to two unidirectional channels, allowing both processes 
to insert and remove messages from the channel. Here we only consider uni- 
directional channels. 

Except for buffer allocation, channel based communication does not differ 
from the previously described send/receive mechanism. In fact, an unbuffered 
channel communication is just a synchronous send/receive communication. 
Thus, we can derive similar results for channel based systems. 

In the corresponding colouring game, tokens are allocated to channels
(component pairs) instead of to components. This change does not affect the
properties used in our proofs. In fact, Lemma 4.1 may be used unchanged. We
call this the per channel allocation scheme.

8.1 The Buffer Allocation Problem 

The corresponding Buffer Allocation Problem ($BAP_{sr}$) is this: given a
communication graph G and an integer k, determine whether there exists a token
assignment of size k such that no colouring sequence deadlocks on G. Even
though token utilization, during the colouring of a communication graph, is
only dictated by the communication arcs within a process component pair,
determining the number of tokens needed remains NP-hard. The proof is similar
in spirit to Theorem 5.1.

Theorem 8.1 The Buffer Allocation Problem ($BAP_{sr}$) is NP-hard.

Proof: We prove this by reducing 3SAT to $BAP_{sr}$. For any 3SAT instance
F we construct a corresponding communication graph G, polynomial in the size
of F, such that for a token assignment of size n, any colouring sequence will
complete on G if and only if the corresponding variable assignment satisfies
F.

Let F be an instance of 3SAT on n variables and comprising c clauses. The 
construction is nearly identical to that in Theorem 5.1, except for the widgets 
representing the clauses of F. The graph G has 2n process components that 
are labeled by the literals of F, $P_{x_i}$ and $P_{\bar{x}_i}$, $i = 1 \ldots n$. Each component 
comprises c + 1 epochs, where each epoch contains zero or two vertices. 

As in Theorem 5.1, epoch 0 fixes a variable assignment. In epoch 0 each 
component has two vertices: a send vertex, labeled $s_{x_i}$ (or $s_{\bar{x}_i}$), and a receive 
vertex $r_{x_i}$ (respectively $r_{\bar{x}_i}$), $i = 1 \ldots n$. Vertex $s_{x_i}$ is adjacent to vertex $r_{\bar{x}_i}$, 
and vertex $s_{\bar{x}_i}$ is adjacent to vertex $r_{x_i}$; this is a 2-ring, identical to epoch 0 
in Theorem 5.1. Epoch 0 has the following property: 

Property 8.2 Any colouring sequence on G will deadlock in epoch 0 unless 
each process component pair has a token assigned to the token pool of either 
$(P_{x_i}, P_{\bar{x}_i})$ or $(P_{\bar{x}_i}, P_{x_i})$, $i = 1 \ldots n$. Thus, the token assignment must be of at 
least size n. (Follows from Lemma 

Property 8.2 yields the following correspondence between assignments on F 
and token assignments of size n. 

Property 8.3 The corresponding token assignment of a variable assignment 
on F assigns a token to the channel $(P_{x_i}, P_{\bar{x}_i})$ if $x_i$ is true, or to $(P_{\bar{x}_i}, P_{x_i})$ if 
$x_i$ is false. 

The jth epoch represents the jth clause of F, denoted $(a_j, b_j, c_j)$, where 
$a_j, b_j, c_j \in \{x_1, \bar{x}_1, \ldots, x_n, \bar{x}_n\}$. The process components $P_{a_j}$, $P_{\bar{a}_j}$, $P_{b_j}$, $P_{\bar{b}_j}$, $P_{c_j}$, 
and $P_{\bar{c}_j}$ form a 6-ring, while the remaining components have no vertices in 
the jth epoch. Process component $P_{a_j}$ has two vertices in the jth epoch: 
a send vertex, $s_{a_j,j}$, and a receive vertex, $r_{a_j,j}$; similarly, the other five 
components have a send and receive vertex that are correspondingly named. The 
arcs linking the 6 components are these: $(s_{a_j,j}, r_{\bar{a}_j,j})$, $(s_{\bar{a}_j,j}, r_{b_j,j})$, $(s_{b_j,j}, r_{\bar{b}_j,j})$, 
$(s_{\bar{b}_j,j}, r_{c_j,j})$, $(s_{c_j,j}, r_{\bar{c}_j,j})$, and $(s_{\bar{c}_j,j}, r_{a_j,j})$. These form a 6-ring, as illustrated in 
Figure 12. The key property of the jth epoch is this: 




Fig. 12. The clause representation in epoch j. 
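
The construction can be made concrete with a short sketch. The encoding is an assumption made for illustration and is not part of the original construction: a literal is an integer (+i for $x_i$, -i for its negation), a vertex is a triple (kind, literal, epoch), and `build_graph` is a hypothetical helper name. The sketch also assumes that the three literals of a clause range over distinct variables, so that every clause epoch is a proper 6-ring.

```python
from typing import List, Tuple

Literal = int                      # +i stands for x_i, -i for its negation (assumed encoding)
Vertex = Tuple[str, Literal, int]  # (kind, component literal, epoch), kind is "s" or "r"
Arc = Tuple[Vertex, Vertex]        # communication arc (send vertex, receive vertex)


def build_graph(n: int, clauses: List[Tuple[Literal, Literal, Literal]]) -> List[Arc]:
    """Return the communication arcs of G, epoch by epoch."""
    arcs: List[Arc] = []
    # Epoch 0: one 2-ring per variable, between P_{x_i} and P_{not x_i}.
    for i in range(1, n + 1):
        arcs.append((("s", i, 0), ("r", -i, 0)))
        arcs.append((("s", -i, 0), ("r", i, 0)))
    # Epoch j: a 6-ring through a, not a, b, not b, c, not c for the jth clause.
    for j, (a, b, c) in enumerate(clauses, start=1):
        ring = [a, -a, b, -b, c, -c]
        for k in range(6):
            arcs.append((("s", ring[k], j), ("r", ring[(k + 1) % 6], j)))
    return arcs


# One clause (x_1, not x_2, x_3) over three variables.
arcs = build_graph(3, [(1, -2, 3)])
```

The graph has 2n arcs in epoch 0 and six arcs per clause epoch, so its size is linear in the size of F.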

Property 8.4 No colouring sequence on G will deadlock in the jth epoch if 
and only if at least one of the following channels has a token: $(P_{a_j}, P_{\bar{a}_j})$, $(P_{\bar{a}_j}, P_{b_j})$, 
$(P_{b_j}, P_{\bar{b}_j})$, $(P_{\bar{b}_j}, P_{c_j})$, $(P_{c_j}, P_{\bar{c}_j})$, $(P_{\bar{c}_j}, P_{a_j})$. (Follows from Lemma 4.1.) 

A refined version of Property 8.4 is more useful: 

Property 8.5 For any token assignment of size n such that no colouring 
sequence deadlocks on G in epoch 0, no colouring sequence on G will deadlock 
in the jth epoch if and only if at least one of the channels $(P_{a_j}, P_{\bar{a}_j})$, $(P_{b_j}, P_{\bar{b}_j})$, 
and $(P_{c_j}, P_{\bar{c}_j})$ has a token. 

Proof: By Property 8.2, all token assignments of size n that do not cause deadlock 
in epoch 0 only assign tokens to channels of the form $(P_{x_i}, P_{\bar{x}_i})$ or $(P_{\bar{x}_i}, P_{x_i})$. 
Hence, only channels $(P_{a_j}, P_{\bar{a}_j})$, $(P_{b_j}, P_{\bar{b}_j})$, and $(P_{c_j}, P_{\bar{c}_j})$ can have a token. By 
Property 8.4, no colouring sequence on G will deadlock in epoch j if one of 
these channels has a token. ■ 

We claim that given a token assignment of size n, any colouring sequence will 
complete on G if and only if the corresponding variable assignment satisfies 
F. 

If an assignment x satisfies F, then every clause has at least one literal that 
evaluates to true. By Property 8.3, in each clause epoch j at least one of 
the channels listed in Property 8.5 will be allocated a token. Hence, by 
Property 8.5, no colouring sequence will deadlock on G. 

If an assignment x does not satisfy F then there is at least one clause in which 
all literals are false. Let $(a_j, b_j, c_j)$ be the unsatisfied clause. By Property 8.3, 
the corresponding token assignment will not assign a token to $(P_{a_j}, P_{\bar{a}_j})$, 
$(P_{b_j}, P_{\bar{b}_j})$, or $(P_{c_j}, P_{\bar{c}_j})$; hence, by Property 8.5, all colouring sequences will 
deadlock. 

Thus, $\mathrm{BAP}_{sr}$ is NP-hard. ■ 
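
The bookkeeping behind this argument can be checked mechanically on small instances. The following sketch does not simulate the colouring game. It only checks that the token assignment of Property 8.3 places a token on one of the three channels named in Property 8.5 in every clause epoch exactly when the variable assignment satisfies F. The integer encoding of literals (+i for $x_i$, -i for its negation) and all function names are assumptions made for illustration.

```python
from itertools import product


def token_assignment(assignment):
    """Property 8.3: x_i true puts the token on (P_{x_i}, P_{not x_i}), else on the reverse channel."""
    return {(i, -i) if value else (-i, i) for i, value in assignment.items()}


def epochs_covered(clauses, tokens):
    """Property 8.5: every clause epoch has a token on (P_a, P_not_a), (P_b, P_not_b) or (P_c, P_not_c)."""
    return all(any((lit, -lit) in tokens for lit in clause) for clause in clauses)


def satisfies(clauses, assignment):
    """Ordinary 3SAT satisfaction: each clause contains at least one true literal."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses)


# Exhaustive check on a small, arbitrarily chosen formula over three variables.
clauses = [(1, -2, 3), (-1, 2, -3)]
for values in product([False, True], repeat=3):
    assignment = dict(zip((1, 2, 3), values))
    assert epochs_covered(clauses, token_assignment(assignment)) == satisfies(clauses, assignment)
```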

Since tokens are assigned on a per channel basis, token usage depends only 
on the two process components that comprise the channel. Consequently, the 
sufficiency of a token assignment can be verified in linear time. Thus, the easier 
problem $\mathrm{BSP}_{sr}$ is in P, implying that $\mathrm{BAP}_{sr}$ is NP-complete. We describe the 
verification algorithm and prove its correctness. 

To verify the sufficiency of a token assignment, perform a colouring on G: at 
each step of the colouring, a vertex of G is coloured according to the rules 
in section 3. Using a queue to keep track of colourable vertices means that 
determining which vertex to colour next takes O(1) time. Since each vertex 
changes colour at most twice, the maximum length of any colouring sequence 
is 2|V| colourings, so colouring a graph takes O(|V|) time. The token assignment 
is sufficient if and only if the colouring sequence completes. The algorithm's 
correctness follows immediately from the following theorem: every colouring 
sequence on G completes if and only if some colouring sequence on G completes. 
Thus, a token assignment is sufficient if and only if some colouring sequence 
on G completes. 

Theorem 8.6 Let G be a communication graph and B a token assignment on 
G. Every colouring sequence on G completes if and only if some colouring 
sequence on G completes. 



Proof: For any communication graph G, we construct a new graph G' where 
every token is simulated by a process component, the size of the corresponding 
token assignment is zero, and every colouring sequence on G corresponds to a 
colouring sequence on G', such that a colouring sequence on G completes if 
and only if the corresponding colouring sequence on G' completes. Since the 
token assignment on G' is of size zero, by Lemma 4.2 some colouring sequence 
on G' completes if and only if every colouring sequence on G' completes. Hence, 
every colouring sequence on G completes if and only if some colouring sequence 
on G completes. 

To simulate an m token channel (a channel that has been assigned m tokens), m 
process components are chained together. For each channel (P, Q) with m 
tokens, m process components $P_1, P_2, \ldots, P_m$ are interspersed between P and Q. 
The channel (P, Q) is replaced with these channels: $(P, P_1), (P_1, P_2), \ldots, (P_{m-1}, P_m), (P_m, Q)$. 
Each arc from P to Q is replaced by a chain of arcs $P \to P_1 \to P_2 \to 
\ldots \to P_{m-1} \to P_m \to Q$. The replacement is illustrated in Figure 13. 



Fig. 13. Simulating m tokens by m components. 
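
The replacement of one m token channel can be sketched as follows. The vertex encoding (component name, kind, message index l), the generated names for the chain components $P_1, \ldots, P_m$, and the function name `expand_channel` are assumptions made for illustration. The sketch assumes that the arcs of the channel are listed in increasing order of l.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[str, str, int]   # (component name, "s" or "r", message index l)
Arc = Tuple[Vertex, Vertex]     # communication arc (send vertex, receive vertex)


def expand_channel(arcs: List[Arc], P: str, Q: str, m: int):
    """Replace each arc from P to Q by a chain of arcs through m new components."""
    names = [f"{P}->{Q}#{k}" for k in range(1, m + 1)]  # hypothetical names for P_1..P_m
    comm: List[Arc] = []
    order: Dict[str, List[Vertex]] = {name: [] for name in names}
    for s, r in arcs:
        if (s[0], r[0]) != (P, Q):
            comm.append((s, r))            # arcs on other channels are left untouched
            continue
        l = s[2]
        prev = s
        for name in names:
            recv, send = (name, "r", l), (name, "s", l)
            comm.append((prev, recv))      # arc between consecutive components
            order[name] += [recv, send]    # r_{k,l} precedes s_{k,l} inside P_k
            prev = send
        comm.append((prev, r))             # final hop from P_m to Q
    return comm, order


arcs = [(("P", "s", 1), ("Q", "r", 1)), (("P", "s", 2), ("Q", "r", 2))]
comm, order = expand_channel(arcs, "P", "Q", 2)
```

Each original arc becomes m + 1 communication arcs, and each chain component receives and forwards the messages in their original order, which preserves the queue behaviour of the channel.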



We claim that a colouring sequence, $\Sigma$, on G will deadlock if and only if the 
corresponding colouring sequence, $\Sigma'$, on G' deadlocks. First, we construct the 
correspondence and argue its correctness. Second, we argue that sequence $\Sigma$ 
deadlocks on G if and only if the corresponding sequence $\Sigma'$ deadlocks on G'. 
Finally, we apply Lemma 4.2 to prove our result. 

Since the transformation is iterative (each m token channel is independent of 
the other channels), it is sufficient to derive the correspondence between the 
colouring sequence on G and the graph G' where a single m token channel has 
been replaced. Let (P, Q) denote the channel in G that is replaced in G'. 

Let $(s_l, r_l) \in G$, $l = 1, 2, \ldots$, denote the arcs from process component P to Q. 
The corresponding paths in G' are 

$$(s_l, \underbrace{r_{1,l}, s_{1,l}}_{P_1}, \underbrace{r_{2,l}, s_{2,l}}_{P_2}, \ldots, \underbrace{r_{m,l}, s_{m,l}}_{P_m}, r_l)$$

where each arc $(r_{k,l}, s_{k,l})$ is within process component $P_k$ and each arc $(s_{k,l}, r_{k+1,l})$ 
is between process components $P_k$ and $P_{k+1}$; the vertices $s_l$ and $r_l$, $l = 1, 2, \ldots$, 
are called the fringe vertices. 

A colouring sequence $\Sigma$ can be represented as a sequence of differences (or 
moves), $\delta_i$, between every two consecutive colourings $\chi_i$ and $\chi_{i+1}$. The sequence 
$\Delta_{\Sigma} = \delta_1 \delta_2 \ldots$ is a sequence of colouring game moves $\delta_i = (v, \text{colour})$ such that 
applying $\delta_i$ to colouring $\chi_i$ yields $\chi_{i+1}$, the next colouring in $\Sigma$; $\Delta_{\Sigma}$ can be 
derived from $\Sigma$, and $\Sigma$ can be derived from $\Delta_{\Sigma}$ and G. The sequence $\Delta_{\Sigma}$ 
comprises two types of moves: those that colour fringe vertices, called fringe 
moves, and those that do not, called normal moves. 

Given a colouring sequence $\Sigma$ on G, we transform it into the corresponding 
colouring sequence $\Sigma'$ on G'. The transformation replaces some fringe moves 
in sequence $\Delta_{\Sigma}$ with sequences of moves, resulting in the corresponding move 
sequence $\Delta_{\Sigma'}$. This sequence comprises normal moves and added moves; 
added moves are a mixture of fringe moves and moves on the vertices within 
the added components $P_j$. There are four types of fringe moves in $\Delta_{\Sigma}$: colour 
a send vertex $s_l$ yellow ($(s_l, \mathrm{yel})$), colour a send vertex $s_l$ green ($(s_l, \mathrm{grn})$), 
colour a receive vertex $r_l$ yellow ($(r_l, \mathrm{yel})$), and colour a receive vertex $r_l$ green 
($(r_l, \mathrm{grn})$). The transformation is performed in the order that the moves occur 
in sequence $\Delta_{\Sigma}$. 

• If $\delta_i = (s_l, \mathrm{yel})$, then no action is taken. 

• If $\delta_i = (s_l, \mathrm{grn})$, we replace it with the sequence 

$$(r_{1,l}, \mathrm{yel}),\ (s_l, \mathrm{grn}),\ (r_{1,l}, \mathrm{grn}),\ (s_{1,l}, \mathrm{yel}),$$

suffixed by the sequences 

$$(r_{j,l}, \mathrm{yel}),\ (s_{j-1,l}, \mathrm{grn}),\ (r_{j,l}, \mathrm{grn}),\ (s_{j,l}, \mathrm{yel}), \quad j = 2 \ldots k-1,$$

where k is the smallest integer such that the move $(s_{k,l-1}, \mathrm{grn})$ has not yet 
been inserted into the move sequence $\Delta_{\Sigma'}$, that is, vertex $s_{k,l-1}$ has not yet 
been coloured green. 

• If $\delta_i = (r_l, \mathrm{yel})$, we remove it from the sequence; it is restored when we 
replace the move $(r_l, \mathrm{grn})$. 

• If $\delta_i = (r_l, \mathrm{grn})$, we replace this move with the sequence 

$$(r_l, \mathrm{yel}),\ (s_{m,l}, \mathrm{grn}),\ (r_l, \mathrm{grn}),$$

suffixed with the sequences 

$$(r_{g_j,h_j}, \mathrm{yel}),\ (s_{g_j-1,h_j}, \mathrm{grn}),\ (r_{g_j,h_j}, \mathrm{grn}),\ (s_{g_j,h_j}, \mathrm{yel}), \quad j = 0 \ldots k-1,$$

where $g_j = m - j$, $h_j = l + 1 + j$, and k is the smallest integer such that the 
move $(s_{m-k,\,l+1+k}, \mathrm{yel})$ has not yet been inserted into the sequence, that is, 
vertex $s_{m-k,\,l+1+k}$ has not yet been coloured yellow. Since the head of this 
sequence colours $s_{m,l}$ green, $r_{m,l+1}$ can be coloured yellow. If $s_{m-1,l+1}$ is 
yellow, then $s_{m-1,l+1}$ can be coloured green, followed by $r_{m,l+1}$, and finally 
$s_{m,l+1}$ can be coloured yellow; this colouring cascades down the added 
process components. 

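It is important to note that each of the replacement sequences is maximal, 
that is, no additional valid colouring moves on the chain process components 
$P_i$, $i = 1 \ldots m$, may be suffixed to them. The new sequence looks like this: 

$$\Delta_{\Sigma'} = \overbrace{\delta_1 \ldots \delta_{h_1}}^{\text{normal moves}} \underbrace{\delta'_1 \ldots \delta'_{g_1}}_{\text{added moves}} \overbrace{\delta_{h_1+1} \ldots \delta_{h_2}}^{\text{normal moves}} \underbrace{\delta'_{g_1+1} \ldots \delta'_{g_2}}_{\text{added moves}} \ldots$$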


Since G is a contraction of G', all normal vertices are coloured by $\Delta_{\Sigma'}$ in 
the same order as in $\Delta_{\Sigma}$. Recall that normal vertices are not adjacent to the 
process component chain, and hence, are not affected by the transformation. 
While normal vertices within process components P and Q may depend on the 
order that the fringe vertices are coloured, the dependence is via process arcs, 
not communication arcs. Consequently, the normal vertices only depend on 
the order that the fringe vertices are coloured green. Fortunately, this order is 
preserved. By inspection, the replacement sequences of moves are valid. Thus, 
the transformed sequence $\Delta_{\Sigma'}$ is valid. Additionally, all green colouring moves 
on fringe vertices are preserved by the transformation; a vertex is coloured 
green by $\Delta_{\Sigma}$ if and only if the corresponding vertex is coloured green by $\Delta_{\Sigma'}$. 
The following property is key: 

Property 8.7 $\Delta_{\Sigma}$ deadlocks on G if and only if $\Delta_{\Sigma'}$ deadlocks on G'. 

Proof: By contradiction, suppose that $\Delta_{\Sigma}$ deadlocks on G while $\Delta_{\Sigma'}$ can be 
extended, that is, another vertex colouring move may be suffixed to $\Delta_{\Sigma'}$. Let 
v be the vertex that can be coloured by the extension. Vertex v may either be 
a normal vertex, a fringe vertex, or a vertex belonging to a process chain. The 
latter is impossible because every replacement sequence of moves is maximal. 

If v is a normal vertex, then each of its predecessors is either a normal vertex or a 
fringe vertex that has been coloured green. Since the transformation preserves 
the colourings of normal vertices and the order in which vertices are coloured 
green, if $\Delta_{\Sigma'}$ can be extended by colouring v, then so can $\Delta_{\Sigma}$, which is a 
contradiction. 

If v is a fringe vertex, there are four cases: either v is a send vertex $s_l$ being 
coloured yellow or green, or v is a receive vertex $r_l$ being coloured yellow 
or green. The transformation does not affect moves that colour send vertices 
yellow and such a colouring only depends on its process component predecessor 
being green. Hence, if the colouring can be suffixed to $\Delta_{\Sigma'}$, it can also be 
suffixed to $\Delta_{\Sigma}$, resulting in a contradiction. If the extension colours the send 
vertex $s_l$ green, this means that the original sequence can be extended by 
adding either the colouring $(s_l, \mathrm{grn})$ or the colourings $(r_l, \mathrm{yel})(s_l, \mathrm{grn})$, depending on whether 
$r_l$ has been coloured yellow or not in the original sequence $\Delta_{\Sigma}$; thus, it is a 
contradiction. 

Similarly, if v is a fringe receive vertex being coloured green, this is not possible, 
because the transformation colours fringe receive vertices yellow, then green, 
by a single replacement sequence. Finally, if v is a fringe receive vertex $r_l$ that 
can be coloured yellow, the original sequence $\Delta_{\Sigma}$ can be extended by the move 
$(r_l, \mathrm{grn})$, because in the original sequence the corresponding send vertex $s_l$ has 
already been coloured green. Thus, we have another contradiction. 

In the other direction, if the original sequence can be extended, then 
transforming the extension of the sequence $\Delta_{\Sigma}$ yields an extension to the 
presumably deadlocked sequence $\Delta_{\Sigma'}$. Thus, $\Delta_{\Sigma}$ deadlocks on G if and only if $\Delta_{\Sigma'}$ 
deadlocks on G'. ■ 

A corollary of Property 8.7 is that the colouring sequence $\Sigma$ deadlocks if and 
only if the colouring sequence $\Sigma'$ deadlocks. 

By Lemma 4.2 a colouring sequence on G' completes if and only if all colouring 
sequences on G' complete. Hence, a colouring sequence on G completes if and 
only if all colouring sequences on G complete. ■ 

Corollary 8.8 A colouring sequence on G completes if and only if the token 
assignment is sufficient. 



8.2 The Nonblocking Buffer Allocation Problem 



For the Nonblocking Buffer Allocation Problem, the algorithm derived in 
section 5.3 suffices with a small modification. Since the token pools are per 
channel, rather than per process component, the computation must be performed 
on a per pool basis. Hence, there is an additional factor of n in the runtime. 
Since each process may be using up to n channels, the runtime of the algorithm 
becomes $O(|E|n^2 + |V|n\log(|V|n))$; the cost increases because the number of 
allocations to be made becomes quadratic in n. 

9 Conclusion 



As message passing becomes increasingly popular, the problem of determining 
k-safety plays an increasingly important role. The relevance of this problem 
grows as more and more functionality of message passing systems is off-loaded 
to the network interface card, where limited buffer space is a serious issue. 
Even if message passing is kept in main memory, buffer space can still be 
limited due to the sometimes very large data sets used in many parallel and 
distributed programs. Unfortunately, determining k-safety is intractable. 

We have shown that in the receive buffer model, determining the number of 
buffers needed to assure safe execution of a program is NP-hard, and that 
even verifying whether a number of assigned buffers is sufficient is coNP- 
complete. On the positive side, if we require that no send blocks, we provide a 
polynomial time algorithm for computing the minimum number of buffers. By 
allocating this number of buffers, safe execution is guaranteed. In addition, we 
have implemented the NBAP r algorithm, and it is now part of the Millipede 
debugging system [19]. 

For systems with only send buffers, the Buffer Allocation Problem remains 
NP-complete. In addition, we conjecture that the Buffer Sufficiency Problem 
can be solved in polynomial time because the order of the sends in each process 
is fixed. The Nonblocking Buffer Allocation problem for systems with only send 
buffers can be solved in polynomial time. 

For systems with both send and receive buffers, the Buffer Allocation Problem 
as well as the Buffer Sufficiency Problem remain intractable. More interest- 
ingly, the Nonblocking Buffer Allocation problem has become intractable. 

For systems with unidirectional channel buffers, both the Buffer Sufficiency 
Problem as well as the Nonblocking Buffer Allocation Problem have polyno- 
mial time algorithms. However, the Buffer Allocation Problem still remains 
an NP-complete problem. The results (conjectures) are summarized below. 

9.1 Strategies for Reducing Buffer Requirements 

There are several strategies that a programmer can use to reduce the likelihood 
of deadlock when only a few buffers are available. 

The obvious solution is to use synchronous communication, which does not 
require any buffers at all. However, this is not always desirable. 

For efficiency reasons asynchronously buffered communication is often pre- 
ferred. To decrease the risk of deadlock the programmer can introduce epochs 
that are separated by barrier synchronizations. This might reduce the number 
of buffers needed for each epoch, as no buffer requirement spans an epoch 
boundary. If each epoch only needs a small number of buffers, the risk of 
deadlock due to buffer insufficiency is reduced. 

References 

[1] V. Anantharam. The optimal buffer allocation problem. IEEE Transactions on 
Information Theory, 35(4):721-725, 1989. 

[2] J. Bruck, D. Dolev, C. Ho, M. Rosu, and R. Strong. Efficient Message Passing 
Interface (MPI) for Parallel Computing on Clusters of Workstations. In 7th 
Annual ACM Symposium on Parallel Algorithms and Architectures, pages 64-73, 
Santa Barbara, California, July 1995. 

[3] G. Burns and R. Daoud. Robust MPI Message Delivery with Guaranteed 
Resources. MPI Developers Conference at the University of Notre Dame, June 
1995. 

[4] B. Charron-Bost, F. Mattern, and G. Tel. Synchronous, asynchronous, and 
causally ordered communication. Journal of Distributed Computing, 9(4):173- 
191, 1996. 

[5] S. A. Cook. The complexity of theorem-proving procedures. In Proceedings of 
the 3rd Annual ACM Symposium on the Theory of Computing, pages 151-158, 
1971. 

[6] R. Cypher and E. Leu. Repeatable and portable message-passing programs. In 
Proc. of The Symposium on the Principles of Distributed Computing (PODC), 
pages 22-31, 1994. 

[7] R. Cypher and E. Leu. The semantics of blocking and nonblocking send and 
receive primitives. In Proceedings of 8th IEEE International parallel processing 
symposium (IPPS), pages 729-735, 1994. 

[8] J. Dongarra. MPI: A message passing interface standard. The International 
Journal of Supercomputers and High Performance Computing, 8:165-184, 1994. 






[9] J. Dongarra, R. Hempel, A. Hey, and D. Walker. A proposal for a user-level, 
message-passing interface in a distributed memory environment. Technical 
Report TM-12231, ORNL, June 1993. 

[10] G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker. Solving 
problems on concurrent processors. General techniques and regular problems, 
volume 1. Prentice-Hall, Inc., 1988. 

[11] D. Frye, R. Bryant, H. Ho, R. Lawrence, and M. Snir. An external user 
interface for scalable parallel systems. Technical report, IBM highly parallel 
supercomputing systems laboratory, November 1992. 

[12] G. Burns, R. Daoud and J. Vaigl. LAM: An Open Cluster Environment for 
MPI. In Supercomputing Symposium '94, Toronto, Canada, June 1994. 

[13] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the 
Theory of NP-Completeness. W. H. Freeman and Company, New York, 1979. 

[14] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. 
PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Networked 
Parallel Computing. Scientific and engineering computation. MIT Press, 1994. 

[15] P. Huber, A. M. Jensen, L. O. Jepsen, and K. Jensen. Reachability trees for 
high-level Petri nets. Theoretical Computer Science, 45:261-292, 1985. 

[16] K. Jensen. Coloured Petri nets. Basic Concepts, Analysis Methods and Practical 
use, volume 1. Springer Verlag, 1992. 

[17] C. Keppitiyagama and A. Wagner. Asynchronous MPI messaging on Myrinet. In 
Proceedings 15th International Parallel and Distributed Processing Symposium 
(IPDPS'01). IEEE, 2001. 

[18] L. Lamport. Time, clocks and the orderings of events in a distributed system. 
Communications of the ACM, 21:558-565, 1978. 

[19] J. B. Pedersen. Multi-level Debugging of Parallel Message Passing Programs. 
PhD thesis, University of British Columbia, Canada, 2003. In preparation. 

[20] M. Reiman. The optimal buffer allocation problem in light traffic. In IEEE 
Conference on Decision and Control, 1987. 

[21] T. Sheskin. Allocation of interstage storage along an automatic production line. 
AIIE Transactions, 8(1), 1975. 






